September 25, 2015

MemSQL Adds Spark Pipeline

George Leopold

A Spark “Streamliner” introduced this week by in-memory database vendor MemSQL aims to provide Spark users quick access to real-time analytics and transactions.

San Francisco-based MemSQL said Wednesday (Sept. 24) its new platform addresses the growing enterprise need to “synthesize varied data types,” including historical data. Its real-time data pipeline between Apache Spark and MemSQL is intended to make it easier to deploy multiple pipelines and keep up with dynamic data flows.

The Streamliner tool is designed as a single-click deployment of integrated Apache Spark “to eliminate the pain of batch ETL,” the company said. A web-based user interface is designed to allow for multiple real-time data pipelines.

The tool also capitalizes on Apache Spark’s inroads in the enterprise. In June, for example, IBM said it would integrate the open source in-memory processing framework into the “core” of its analytics and commerce platforms. It also said it would work closely with Databricks, the company formed by the creators of the analytics engine.

Databricks released Apache Spark 1.4 in June.

Eric Frenkiel, CEO at MemSQL, said the company’s Spark integration would allow enterprises to move beyond “many narrow purpose solutions to fewer multi-purpose solutions.” Frenkiel added in a statement: “Our vision is to operationalize Spark for a wide range of use cases so customers and partners can easily take advantage of the data processing framework available in Spark and spend their time gaining actionable insights from data.”

The ability to deploy and manage multiple real-time pipelines with a single interface and shared resource pool is expected to benefit applications ranging from trading analytics and cyber-security to omnichannel retail and Internet of Things use cases, the company said.

MemSQL said its Spark Streamliner could support thousands of simultaneous users running real-time analytics queries. The platform also is touted as reducing latency by streaming data directly into the MemSQL database, into either its memory-based row store or its disk-based columnar store.

MemSQL is among an emerging class of in-memory relational databases gaining momentum for their capability to ingest and analyze large amounts of data in near real time. MemSQL 3.0, unveiled last year, added a flash-based columnar store designed for storage and analysis of historical data.

The in-memory database is intended to address a familiar problem: organizations have previously relied on big data warehouses or Hadoop to crunch large volumes of historical data and to create data models. After being created by a Hadoop cluster or a Teradata warehouse, these data models are then used by operational systems, such as NoSQL databases, to make real-time decisions.

However, simply moving data through batch ETL and CDC processes can take many hours, if not days. Moreover, data models contain older data that could translate into missed opportunities. Hence, MemSQL integrates both pieces of the data analytics puzzle–the data model that informs analytic decision making and the operational data store that acts on those decisions–in the same place.
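The contrast with batch ETL can be illustrated with a minimal Python sketch of a pipeline that extracts, transforms and loads events in small batches as they arrive. It does not show Streamliner, Spark or any MemSQL interface: `InMemoryTable`, `micro_batches`, `run_pipeline` and `to_dollars` are hypothetical names invented for this illustration, and a plain list stands in for the operational store.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator


@dataclass
class InMemoryTable:
    """Hypothetical stand-in for an operational table; not a MemSQL API."""
    rows: list = field(default_factory=list)

    def load(self, batch: list) -> None:
        # Load step: append the transformed batch so it is queryable at once.
        self.rows.extend(batch)


def micro_batches(events: Iterable[dict], size: int) -> Iterator[list]:
    """Extract step: group an incoming event stream into small batches."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


def run_pipeline(events: Iterable[dict],
                 transform: Callable[[dict], dict],
                 table: InMemoryTable,
                 batch_size: int = 2) -> int:
    """Run extract, transform and load once per micro-batch.

    Returns the number of batches processed.
    """
    count = 0
    for batch in micro_batches(events, batch_size):
        table.load([transform(event) for event in batch])
        count += 1
    return count


def to_dollars(event: dict) -> dict:
    """Transform step (assumed example): convert a price in cents to dollars."""
    return {"symbol": event["symbol"], "price": event["cents"] / 100}


# Made-up sample events, used only to exercise the sketch.
events = [
    {"symbol": "AAA", "cents": 1050},
    {"symbol": "BBB", "cents": 2000},
    {"symbol": "AAA", "cents": 1075},
]
table = InMemoryTable()
batches = run_pipeline(events, to_dollars, table)
```

In this sketch each small batch becomes visible in the table as soon as it is loaded, rather than after a long-running bulk job. That is the general idea behind replacing batch ETL with a streaming pipeline, though a production system would add concerns such as fault tolerance and delivery guarantees that are omitted here.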

The company said this week its Streamliner tool could integrate Apache Spark to provide immediate access to real-time analytics. MemSQL said its Spark Streamliner is available as open source on GitHub. The open source approach is intended to spur development of applications based on real-time data and easy access via transactional SQL.

Recent items:

IBM, Databricks Join Forces to Advance Spark

Put a Data Warehouse In Your Operational Data Store, MemSQL Says

Applications: Predictive Analytics

Technologies: Frameworks

Sectors: Financial Services, Retail

Tags: apache spark, memsql, NoSQL databases
