September 27, 2016

Survey: Spark Going ‘Mainstream’

George Leopold

That rumbling sound you hear is Apache Spark entering production deployments in public clouds along with surging use of the cluster-computing framework’s streaming and machine learning capabilities, according to a new vendor survey that also found more diverse users and use cases.

Databricks Inc., the San Francisco-based startup behind Apache Spark, released survey results on Tuesday (Sept. 27) revealing steady momentum as the Spark user community more than tripled over the past year to 225,000 members.

“The results indicate that Spark has moved well beyond the early-adopter phase at high-tech companies and is now mainstream in large data-driven enterprises,” the startup asserted.

As developer participation soared, the Databricks poll of 900 organizations found that Spark deployments in the public cloud are surging as more industries shift to cloud computing. For example, the survey found that public cloud deployments of Spark jumped 10 percent over last year to 61 percent in 2016. Meanwhile, Spark deployments on on-premises cluster managers continue to decline, the survey found.

Spark also is spurring the surge in fast data analytics: more than half of the survey’s 1,600-plus respondents pointed to data streaming as a key component for deploying real-time streaming and analytics platforms. Production use of Spark streaming surged 57 percent over the past year, and the adoption rate for Spark’s machine learning library, MLlib, grew 38 percent year-on-year.

Along with Spark-based streaming and machine learning applications entering production, the Databricks survey found that deployments of other Spark components such as DataFrames more than doubled over the last year. A DataFrame is a distributed collection of data organized into named columns. The survey found that production deployments of DataFrames rose to 38 percent over the last year.

Meanwhile, Spark SQL deployments rose 16 percent year-on-year to 40 percent of those polled.

Based on its survey results, Databricks said it expects Spark momentum to continue building as a diverse set of new users embraces the data-processing engine. One reason is simplicity, a characteristic found lacking in another recent industry survey that cited “inflexibility” in current data analytics infrastructure as a key reason for many failed big data projects.

Hence, Databricks executives noted that ease of use and better performance headed the list of key Spark features most often cited by users. They also cited the accessibility of common programming languages supported by Spark, including R and SQL, “suggesting new users are not only data engineers but data analysts,” the company said.

Meanwhile, Spark usage among Windows users also increased by 9 percent over the previous year to 32 percent of those surveyed, Databricks reported. “These attributes make Spark an attractive engine for performing advanced analytics across industry verticals in solving complex data problems, by users from different functional roles,” Reynold Xin, Databricks’ chief architect and co-founder, noted in a statement releasing the survey results.

Recent items:

Inflexible Data, Analytics Fueling Failures, Survey Finds

Is Spark Overhyped?

Applications: Artificial Intelligence

Sectors: Financial Services, Other, Retail

Vendors: Databricks

Tags: Apache MLlib, apache spark, data streaming, machine learning, sql

