Databricks, Talend Expand Cloud Access to Spark
Databricks and Talend, the cloud data integration vendor, are joining forces to help data jockeys scale their integration efforts using the Apache Spark analytics engine hosted on Talend’s cloud.
Databricks, founded by the creators of Apache Spark, and Talend (NASDAQ: TLND) said integration between Talend’s cloud service and Databricks’ analytics platform would enable data engineers to leverage the cluster computing framework for processing large data sets at scale. The integration would replace manual coding with a drag-and-drop interface, the partners said Thursday (Nov. 15).
Talend’s cloud is already integrated with the Databricks analytics engine running on Amazon Web Services (NASDAQ: AMZN) and Microsoft Azure (NASDAQ: MSFT). The integration would allow joint customers to move more analytics workloads to Databricks via Talend’s cloud, according to Michael Pickett, Talend’s senior vice president for corporate and business development.
Databricks pitches its cloud-based analytics platform as running workloads ranging from data pipelines to machine learning models “in one place.” The cloud-native service also includes features like automatic configuration and scaling.
“Our joint integration provides a single source of trusted data which is essential when data teams are working with machine learning algorithms,” said Michael Hoff, senior vice president for business development at San Francisco-based Databricks.
Talend, Redwood City, Calif., said its native support for the Databricks analytics platform would allow users to spin up and down a big data cluster in the cloud, then ingest and process large data volumes, paying only for those cloud resources used.
The partnership with Databricks represents Talend’s latest foray into providing cloud services to data scientists and analysts. Earlier this year, it launched a free data streaming service based on Apache Beam to build data pipelines on the AWS cloud.
The shift to real-time data streaming expands on Talend’s earlier work on batch-oriented ETL. The streaming service builds data pipelines in either batch or streaming mode.