November 19, 2018

Databricks Upgrades Spark Support, Adds ML Runtime

George Leopold

Databricks announced support this week for the latest version of Apache Spark, integrating it into its enterprise analytics platform. Along with support for version 2.4 of the cluster computing framework, which ships as part of Databricks’ latest runtime, the company unveiled a new runtime feature aimed at simplifying deep learning.

The Spark 2.4 release, unveiled earlier this month, includes upgrades that improve the performance of distributed deep learning and machine learning frameworks running on Spark. Databricks noted that the upgraded version of its analytics platform running on Spark 2.4 includes improvements that address dependencies associated with deep learning tasks.

The Spark upgrades were consolidated in an effort called Project Hydrogen, which introduced a new scheduling mode called “barrier execution.” The mode allows developers to embed training for distributed deep learning as an Apache Spark workload, San Francisco-based Databricks said.
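
The idea behind barrier execution can be illustrated with a minimal sketch. The code below is not the Spark API: it uses only Python's standard `threading.Barrier` to show the pattern of launching all tasks of a stage together and having each wait until every peer has reached the same point, which is what distributed training needs before workers exchange results. The function name `run_barrier_stage` and its parameters are hypothetical, chosen for this illustration. In PySpark 2.4 the corresponding entry points are `RDD.barrier()` and `BarrierTaskContext.barrier()`.

```python
import threading


def run_barrier_stage(num_tasks, work):
    """Run `work(rank)` in `num_tasks` threads that synchronize at a barrier.

    Illustrative stand-in for a barrier stage: every task starts together,
    and no task proceeds past the barrier until all tasks have arrived.
    """
    barrier = threading.Barrier(num_tasks)
    lock = threading.Lock()
    arrived = []                      # ranks that finished their local work
    results = [None] * num_tasks

    def task(rank):
        partial = work(rank)          # local computation for this task
        with lock:
            arrived.append(rank)
        barrier.wait()                # block until all tasks reach this point
        with lock:
            seen = len(arrived)       # after the barrier, every peer has arrived
        results[rank] = (partial, seen)

    threads = [threading.Thread(target=task, args=(r,)) for r in range(num_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for rank, (partial, seen) in enumerate(run_barrier_stage(4, lambda r: r * r)):
        print(f"task {rank}: partial={partial}, peers seen after barrier={seen}")
```

Each task reports how many peers had finished their local work by the time it passed the barrier. Because no task passes until all have arrived, that count always equals the number of tasks, which is the guarantee a conventional scheduler that runs tasks independently does not provide.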

“This is the largest change to Spark’s scheduler since the inception of the project,” said Reynold Xin, co-founder at Databricks and a Spark contributor. Xin added that the upgrades would help reduce the complexity of machine learning workloads.

The new runtime feature, dubbed HorovodRunner, is designed to simplify scaling distributed deep learning workloads from a single machine to large clusters. Previously, migrating from single-node workloads to distributed training on CPU or GPU clusters required full code rewrites, the company said. Databricks claims HorovodRunner reduces programming and training time from hours to minutes.

Along with Horovod, the distributed training framework, Databricks said its platform provides native integrations with Keras, TensorFlow and other machine learning frameworks, along with MLlib and GraphFrames machine learning algorithms.

Last week, Databricks announced a partnership with cloud data integrator Talend (NASDAQ: TLND) that combines Talend’s cloud service with Databricks’ analytics platform, enabling data engineers to use the cluster computing framework to process large data sets at scale.
