September 19, 2013

MemSQL Bolsters Semi-Structured Chops of Its RDBMS

Alex Woodie

One of the hot areas in data analytics these days is fusing the worlds of structured and semi-structured data. There are many ways to analyze one or the other, but few platforms that do both well. MemSQL, a part of the NewSQL movement, this week made headway in its capability to store and analyze fast-moving, semi-structured data by adding support for JSON data types to its SQL-based RDBMS.

MemSQL was founded in 2011 by former Facebook engineers Eric Frenkiel and Nikita Shamgunov, and came to market in 2012 with the first release of its distributed, in-memory, relational database. The software is designed to store and process big, fast, and semi-structured data–think sensor and machine data–in real time, from the comfort of SQL.

Keeping the database’s interface in SQL, the thinking goes, eliminates the need for applications and developers to become experts in things like Memcache or to use true NoSQL databases. This approach–which MemSQL accomplishes by converting SQL queries into C++ code that is then stored as a shared object–allows MemSQL to support new data types that legacy SQL databases like DB2 and MySQL struggle with, but without adding layers of complex new technologies or middleware.

The addition of JavaScript Object Notation (JSON) support in the upcoming release of MemSQL version 2.5 will make its semi-structured data management story even more compelling. Thanks to its adoption by Facebook, Twitter, and other Web 2.0 properties, JSON has become a de facto standard syntax for storing and exchanging semi-structured data. The addition of native JSON support will enable MemSQL to store and process clickstream and event feed data from the Web.
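To make the idea of querying JSON through SQL concrete, here is a minimal Python sketch. It is not MemSQL code: it uses the SQLite engine from Python's standard library as a stand-in, and the `json_get` SQL function, the `clicks` table, and the sample events are all hypothetical names invented for illustration. MemSQL's own JSON syntax is not described in this article and may look quite different.

```python
import json
import sqlite3


def json_get(doc, key):
    """Hypothetical helper: return the top-level value under `key` in a JSON document."""
    try:
        value = json.loads(doc).get(key)
    except (TypeError, ValueError, AttributeError):
        return None  # not valid JSON, or not a JSON object
    # SQLite functions can only return scalars, so re-serialize nested values.
    if isinstance(value, (dict, list)):
        return json.dumps(value)
    return value


def page_counts(events):
    """Store clickstream events as JSON text, then count hits per page with plain SQL."""
    conn = sqlite3.connect(":memory:")
    # Expose the Python helper to SQL under the (assumed) name json_get.
    conn.create_function("json_get", 2, json_get)
    conn.execute("CREATE TABLE clicks (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO clicks (payload) VALUES (?)",
        [(json.dumps(event),) for event in events],
    )
    rows = conn.execute(
        "SELECT json_get(payload, 'page') AS page, COUNT(*) AS hits "
        "FROM clicks GROUP BY page ORDER BY hits DESC, page"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    sample = [
        {"user": "a", "page": "/home"},
        {"user": "b", "page": "/pricing"},
        {"user": "a", "page": "/home"},
    ]
    print(page_counts(sample))
```

The point of the sketch is only that the events stay in their original JSON form while the aggregation is written in ordinary SQL; a database with native JSON support would do the field extraction inside the engine rather than in a user-defined function.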

According to MemSQL, existing NoSQL databases have the capability to handle JSON data, but they either don’t do it in real time (the data must first be transformed) or they don’t do it at scale (MemSQL claims to scale across thousands of nodes and hundreds of terabytes).

This screenshot shows how JSON data can be accessed through SQL.

The claim about NoSQL databases is a dig at MongoDB, one of the bigger NoSQL vendors, and one that supports JSON data. Vendors of traditional SQL-based RDBMSs have it even worse, since SQL doesn’t natively provide a way to read JSON. Legacy SQL database vendors, like IBM and Oracle (with MySQL), are adding support for JSON in their databases, but it’s a work in progress.

MemSQL hopes that its addition of JSON support will fill the gap left by traditional RDBMS and NoSQL database vendors. The company has caught the eye of John Myers, senior analyst at Enterprise Management Associates, an IT analyst firm based in Boulder, Colorado, who said that the addition of “true SQL-based access to JSON promises to bring analytical response to multi-structured datasets in a way that will be difficult to match.”

The San Francisco-based company is one of the up-and-coming NewSQL vendors to keep an eye on. The company is a 2012 graduate of Y Combinator, a seed accelerator program that brings dozens of promising tech companies to Silicon Valley to whip their technologies and VC pitches into shape. Since then, it’s gone through two rounds of investments, totaling $5.1 million.

MemSQL reportedly has been adopted by dozens of companies, including Zynga, Morgan Stanley, Shutterstock, Comcast, CPX Interactive, and Ziff Davis. Shutterstock, the online picture-sharing outfit, is using MemSQL to collect, analyze, and interact with more than 800,000 data points, according to Chris Fischer, the company’s vice president of technology operations.

MemSQL is one of dozens of NoSQL and NewSQL technology startups that are competing to be the engines behind next-generation transactional and analytical systems. These next-gen systems will be characterized by the need to process a variety of data types–from unstructured data like pictures to semi-structured data like JSON to traditional SQL-compliant data–and to do so on-the-fly, for analytical and transactional workloads, and in a secure, compliant, and scalable manner.

It’s extremely doubtful that any one database will be able to accomplish all of that. More than likely, organizations will find a way to cobble databases together to achieve certain ends, much as MySQL supports multiple storage engines for different types of workloads. It’s worth noting that MemSQL, in addition to being ACID compliant, is wire-compatible with MySQL, which means that any application designed to run atop MySQL will be able to run against MemSQL without much change.

Version 2.5 of the database is due out in October. Other notable features include support for an online ALTER TABLE command that will allow customers to add new columns or tables to the database without downtime.

Datanami