June 6, 2016

MapR Unveils Spark-Only Distro

Alex Woodie

Big data practitioners who want to get started quickly with Apache Spark but don’t want to mess around with Hadoop may be interested in new software that MapR Technologies announced today.

MapR’s new Apache Spark Distribution provides the complete Spark stack, enabling developers to begin building Spark apps that utilize various APIs for batch and stream processing, graph analytics, and SQL.

While the software uses YARN as a resource scheduler and MapR’s file system (which borrows from HDFS), there aren’t any other Hadoop components in the distribution. Customers can add standard Hadoop features like MapReduce, Hive, and Pig, as well as proprietary MapR add-ons like MapR-DB and MapR Streams, if they want, but they don’t have to.

MapR calls it a “Spark-focused distribution,” and it’s no coincidence that it’s being unveiled on the first day of Databricks’ Spark Summit event, which is taking place in San Francisco.

“Previously Spark was bundled with Hadoop and optionally converged with all the other options of the MapR platform (NoSQL, etc.),” says Jack Norris, senior vice president of data and applications at MapR. “We’ve seen a lot of interest in Spark and many developers and organizations are starting with Spark directly. So with this Spark Distribution it allows organizations that just want Spark to have a dedicated distro with the integrated data platform.”

Norris says MapR’s Apache Spark Distribution gives big data developers and analysts the Spark functions they need, without forcing them to make compromises.

“If you are looking to have a large scale distributed data store with Spark,” he tells Datanami, “you have to compromise with a platform geared to batch (HDFS) or a NoSQL with eventual consistency (Cassandra).”

MapR will also leverage its Spark Distribution in its Quick Start Solution offerings, which include pre-built templates, configuration and installation. The most popular use cases for Spark include building data pipelines and developing advanced analytical applications leveraging machine learning.

MapR says it’s seen “significant growth” in the number of customers deploying Spark as their primary compute engine. That backs up research by Enterprise Strategy Group showing that 16% of businesses have already deployed Spark to production and that another 47% are “very interested” in implementing Spark. “As such, Spark will power the next wave of big data,” says senior ESG analyst Nik Rouda in MapR’s press release.

The launch of a dedicated Spark distribution is the strongest commitment to Spark among what were once the three “Hadoop” distributors: MapR, Cloudera, and Hortonworks. All three companies include Spark in their distributions, but MapR is the only one that has taken most of the Hadoop components out.

This approach will make it easier for customers to get started with Spark, says Anoop Dawar, vice president of product management for MapR. “We believe this gives our customers a converged compute and storage engine for batch, analytics, and real-time processing that helps build and deploy applications rapidly,” he says in a statement.


