Apache Spark Archives - insideAI News
https://insideainews.com/tag/apache-spark/
Illuminating AI's Frontiers: Your Go-To News Destination.
Wed, 19 Oct 2022 15:48:55 +0000

Databricks Announces Major Contributions to Flagship Open Source Projects
https://insideainews.com/2022/07/02/databricks-announces-major-contributions-to-flagship-open-source-projects/
Sat, 02 Jul 2022 13:00:00 +0000
Databricks announced that the company will contribute all features and enhancements it has made to Delta Lake to the Linux Foundation and open source all Delta Lake APIs as part of the Delta Lake 2.0 release. In addition, the company announced MLflow 2.0, which includes MLflow Pipelines, a new feature to accelerate and simplify ML model deployments. Finally, the company introduced Spark Connect, to enable the use of Spark on virtually any device, and Project Lightspeed, a next generation Spark Structured Streaming engine for data streaming on the lakehouse.

StreamSets Launches StreamSets Transformer
https://insideainews.com/2019/09/15/streamsets-launches-streamsets-transformer/
Sun, 15 Sep 2019 20:00:55 +0000
StreamSets, Inc., provider of the DataOps platform for modern data integration, released StreamSets® Transformer, a simple-to-use, drag-and-drop UI tool to create native Apache Spark applications.
Designed for a wide range of users, even those without specialized skills, StreamSets Transformer enables the creation of pipelines for performing ETL, stream processing and machine-learning operations. Now, data engineers, scientists, architects and operators gain deep visibility into the execution of Apache Spark while broadening usage across the business.

State of the Art Natural Language Processing at Scale
https://insideainews.com/2018/07/05/state-art-natural-language-processing-scale/
Thu, 05 Jul 2018 15:30:11 +0000

Databricks Partners with RStudio To Increase Productivity of Data Science Teams
https://insideainews.com/2018/06/29/databricks-partners-rstudio-increase-productivity-data-science-teams/
Fri, 29 Jun 2018 15:30:24 +0000
Databricks, a leader in unified analytics founded by the original creators of Apache Spark™, announced a partnership with RStudio, providers of a free and open-source integrated development environment for R, to increase the productivity of data science teams. The partnership will allow the two companies to seamlessly integrate Databricks’ Unified Analytics Platform with the RStudio Server, simplifying R programming on big data.

Apache Spark 2.0: A Deep Dive Into Structured Streaming
https://insideainews.com/2018/05/28/apache-spark-2-0-deep-dive-structured-streaming/
Mon, 28 May 2018 15:30:18 +0000
In this talk, Tathagata Das takes a deep dive into the concepts and the API and shows how this simplifies building complex “Continuous Applications”. Tathagata is an Apache Spark Committer and a member of the PMC. He’s the lead developer behind Spark Streaming, and is currently employed at Databricks.

Top 5 Mistakes When Writing Spark Applications
https://insideainews.com/2018/01/07/top-5-mistakes-writing-spark-applications/
Sun, 07 Jan 2018 16:30:46 +0000
In the presentation below from Spark Summit 2016, Mark Grover goes over the top 5 things that he's seen in the field that prevent people from getting the most out of their Spark clusters. When some of these issues are addressed, it is not uncommon to see the same job running 10x or 100x faster with the same clusters, the same data, just a different approach.

The Data Scientist’s Guide to Apache Spark
https://insideainews.com/2017/12/27/data-scientists-guide-apache-spark/
Wed, 27 Dec 2017 16:30:17 +0000
Looking to dive deeper into the more cutting-edge machine learning use cases in Apache Spark?
To successfully use Spark’s advanced analytics capabilities including large scale machine learning and graph analysis, check out The Data Scientist’s Guide to Apache Spark, from our friends over at Databricks.

The Data Scientist’s Guide to Apache Spark™
https://insideainews.com/white-paper/data-scientists-guide-apache-spark/
Tue, 26 Dec 2017 17:53:38 +0000
For data scientists looking to apply Apache Spark’s advanced analytics techniques and deep learning models at scale, Databricks is happy to provide The Data Scientist’s Guide to Apache Spark. Download this eBook to: Learn the fundamentals of advanced analytics and receive a crash course in machine learning. Get a deep dive on MLlib, the primary […]

Databricks Launches Delta To Combine the Best of Data Lakes, Data Warehouses and Streaming Systems
https://insideainews.com/2017/10/26/databricks-launches-delta-combine-best-data-lakes-data-warehouses-streaming-systems/
Thu, 26 Oct 2017 17:00:07 +0000

Apache Spark Expands With Cypher, Neo4j’s ‘SQL For Graphs,’ Adds Declarative Graph Querying
https://insideainews.com/2017/10/24/apache-spark-expands-cypher-neo4js-sql-graphs-adds-declarative-graph-querying/
Tue, 24 Oct 2017 17:00:57 +0000