August 22, 2017

Databricks, Flush With Cash, Steers Spark at AI

Momentum around the Apache Spark cluster computing framework continues to build with the announcement of a hefty late-stage funding round that will help push the analytics platform and related artificial intelligence applications deeper into enterprises.

San Francisco-based Databricks, the startup formed by the team that created Spark, announced this week it has secured $140 million in venture funding. Silicon Valley powerhouse Andreessen Horowitz led the Series D round along with New Enterprise Associates and Battery Ventures.

The cash infusion illustrates how Databricks and other Spark proponents are positioning the analytics platform as a bridge from data science to enterprise-wide AI. The company said it would use the funding to advance its Spark-based Unified Analytics Platform.

The company also said Tuesday (Aug. 22) it would seek to extend the reach of its cloud-based analytics platform deeper into the financial services, government, health care and media sectors.

Databricks and its investors are targeting an emerging global AI market estimated to be worth nearly $37 billion by 2025. “While almost every business is exploring how they can use artificial intelligence for competitive advantage, very few are able to do so effectively today,” the company noted in a statement announcing the results of its latest funding round.

Among the goals are making AI and data science more “approachable” to data-driven enterprises, the company added.

The huge funding round underscores Spark’s growing popularity among customers looking for ways to leverage AI tools along with real-time data streaming capabilities, which Databricks CEO and co-founder Ali Ghodsi insists are “taking off.”

Earlier this year, Databricks rolled out a new version of its Spark-based cloud platform that specifically targets data engineering workloads. The company asserts the platform will enable data engineers to combine SQL, structured streaming, ETL and machine learning workloads running on the cluster-computing framework. Among the goals is accelerating the secure deployment of data pipelines in production.

Meanwhile, the startup has been steadily pushing the underlying goal of Spark: “democratizing” big data via high-level APIs and an engine that combines machine learning and ETL with interactive and streaming SQL.

To that end, the company unveiled a new open-source library in June called Deep Learning Pipelines designed to ease the integration of deep learning into workloads ranging from machine learning to business analytics.

“We believe Deep Learning Pipelines has the potential to accomplish what Spark did to big data: make the deep learning ‘superpower’ approachable for everybody,” the company asserted in a blog post.
