February 2, 2016

Redis Connector Aims to Boost Spark Performance

George Leopold

Adoption of the Apache Spark big data framework continues to build momentum, with NoSQL leader Redis Labs releasing a Spark connector package along with integration with Spark SQL.

Redis Labs said Tuesday (Feb. 2) its Spark connector package is being released as an open source option that includes a library for writing to and reading from a Redis cluster. The release exposes all Redis data structures to Spark through a resilient distributed dataset (RDD) API. Redis Labs, based in Mountain View, Calif., also said its connector package provides closer alignment between Spark and Redis clusters, which is intended to reduce network overhead while improving processing performance.

The company claimed its performance benchmark based on time-series data revealed that Spark running on Redis as a data store yielded processing speeds as much as 135 times faster than Spark using the Hadoop Distributed File System. Redis said Spark ran as much as 45 times faster on its platform than the Tachyon in-memory file system with Spark storing data in an on-heap data structure.

Redis said other advantages of using Spark with its platform include a more than 100-fold increase in Spark performance in applications such as time-series analysis, which gathers a large sequence of measurements over time. The company also said its data structures allow data elements to be accessed individually, thereby reducing serialization/deserialization overhead. That feature also reduces the need to transfer large data batches.
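The serialization point can be illustrated with a small sketch. It is not the connector's code and uses neither Redis nor Spark: it is a toy in-memory model, written in Python, that contrasts a time series stored as one serialized blob with a time-ordered structure that supports range reads (loosely analogous to a Redis sorted set queried by score). The names `read_range_from_blob` and `SortedSeries` are hypothetical and chosen only for illustration.

```python
import bisect
import json


def read_range_from_blob(blob, start, end):
    """Blob-style storage: the whole series must be deserialized
    before any time window can be selected."""
    points = json.loads(blob)  # decodes every point in the series
    selected = [(t, v) for t, v in points if start <= t <= end]
    return selected, len(points)


class SortedSeries:
    """Toy stand-in for a time-ordered structure with range reads.
    Hypothetical; it only models the access pattern."""

    def __init__(self):
        self._times = []
        self._values = []

    def add(self, t, value):
        # Keep both lists ordered by timestamp.
        i = bisect.bisect_left(self._times, t)
        self._times.insert(i, t)
        self._values.insert(i, value)

    def range_by_time(self, start, end):
        # Binary search finds the window; only those points are touched.
        lo = bisect.bisect_left(self._times, start)
        hi = bisect.bisect_right(self._times, end)
        return list(zip(self._times[lo:hi], self._values[lo:hi]))


# 1,000 synthetic measurements, one per time step (made-up data).
series = [(t, t * 0.5) for t in range(1000)]

blob = json.dumps(series)
sorted_series = SortedSeries()
for t, v in series:
    sorted_series.add(t, v)

blob_result, decoded = read_range_from_blob(blob, 100, 104)
direct_result = sorted_series.range_by_time(100, 104)

print(decoded, "points decoded to answer the query from the blob")
print(len(direct_result), "points read from the ordered structure")
```

Both paths return the same five measurements, but the blob path decodes all 1,000 points to get them. That is the kind of overhead the company says element-level access avoids; the sketch says nothing about the size of the speedups Redis Labs reported.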

Yiftach Shoolman, cofounder and CTO of Redis Labs, emphasized that the company’s Apache Spark connector addresses the growing demand to extract big data insights in real time. Hence, the company focused on fine-tuning its distributed memory capabilities to accelerate Spark performance.

“Our goal is to make Redis the de-facto data store for any Spark deployment,” Shoolman noted in a statement.

As a result, a Redis cluster can be used as a distributed memory infrastructure for Spark. The company also said the combination would expose its data structures through the Spark RDD and DataSet APIs. Databricks Inc., which was founded by the creators of Spark, included the DataSet API in the 1.6 release of Spark.

San Francisco-based Databricks said it worked closely with Redis Labs to develop the connector package with the goal of delivering real-time analytics.

Redis also said its integration with Spark would enable Spark SQL support via the DataFrame and DataSet APIs as a standard query interface.

Future enhancements to the Spark-Redis connector package include applying it to new use cases such as graph computation and machine learning, Redis added.

Meanwhile, the Apache Spark community gathers in New York City from Feb. 16-18 to convene a summit focusing on advances in the open source processing engine, Spark SQL and Spark streaming.
