December 14, 2016

MariaDB Takes On Teradata, Vertica with Column Store

Look out, Teradata and HPE Vertica: There’s a new MPP player in town. MariaDB today announced that it has added a column store to its popular relational database, thereby enabling it to efficiently run a whole new class of petabyte-scale analytical queries that had previously been the domain of proprietary offerings.

Column-store databases are more efficient at analytical queries because of the way they store data. By organizing the data in a column orientation, as opposed to the standard row-based format typically found in databases used to power transactional systems, column stores deliver blazing-fast results for common analytical queries, such as aggregations, that would require an inordinate number of disk seeks in a row-based orientation.
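To make the difference concrete, here is a minimal Python sketch of the same small table held in row orientation and in column orientation. The table, its field names, and the helper functions are invented for illustration and say nothing about how MariaDB ColumnStore lays out data internally; the sketch only shows why an aggregation over one column touches fewer values in a columnar layout.

```python
# Toy illustration only: a hypothetical three-row "orders" table.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 75.5},
    {"order_id": 3, "region": "east", "amount": 42.25},
]

# Column orientation: one contiguous list of values per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_row_store(table, field):
    """Sum one field by visiting every full record, as a row layout would."""
    return sum(record[field] for record in table)


def total_column_store(table, field):
    """Sum one field by scanning only that column's values."""
    return sum(table[field])


row_total = total_row_store(rows, "amount")
col_total = total_column_store(columns, "amount")

# Rough proxy for I/O: how many stored values each layout has to pass over.
values_read_row = len(rows) * len(rows[0])  # every field of every row
values_read_col = len(columns["amount"])    # just the one column

print(row_total, col_total, values_read_row, values_read_col)
```

Both layouts return the same total, but the row layout passes over every field of every record while the column layout reads a single list. On a wide fact table with billions of rows, that gap is what turns into saved disk seeks.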

Over the past 15 years, the most popular column-store databases have been proprietary offerings. Often built on massively parallel processing (MPP) architectures, analytical systems from the likes of Teradata (NYSE: TDC), HPE (soon to be Micro Focus) Vertica, IBM (NYSE: IBM) Netezza (now called PureData System for Analytics), Greenplum, and ParAccel commanded big license fees and required large teams to deploy and run.

Now MariaDB, the open source successor to the massively popular MySQL database that’s now owned by Oracle (NYSE: ORCL), is hoping to shake up the column-store MPP database market with the launch of ColumnStore 1.0.

David Thompson, vice president of ColumnStore engineering at MariaDB, says that while ColumnStore won’t be free, it is open source. And with a starting price of $9,000 per node, it should attract its share of customers.

“I would view this as competitive with the likes of Teradata and Vertica,” he tells Datanami. “One of the big differences with us is this is an open source offering. And it’s going to cost you about 90 percent less than a Teradata deployment.”

While some Teradata or Vertica customers may seek to migrate from those systems, the bigger opportunity may be lifting smaller customers up from single-server data marts and allowing them to take advantage of the parallel processing and automatic data sharding capabilities that are built into the new offering.

“There’s a lot of people who built up small datamarts on MySQL or MariaDB server, and then they sort of outgrow that,” Thompson says. “The column store is great because they can just move the data to a new storage engine, and their same queries work, but they work a lot faster.”

Thompson says MariaDB ColumnStore supports the full breadth of the ANSI SQL specification as it relates to analytical commands. That puts the software on the same plane as the mature MPP databases from Teradata, HPE Vertica, and the others, and elevates it above what you can get with a SQL-on-Hadoop offering, such as Hive or Impala, Thompson says.

“If you compare us to Hadoop, we provide a lot of the same simplified scale out, but we give you ANSI SQL,” he says. “You can use the query language you’re used to and the tools you’re used to.”

The column store, which is an upgraded version of the InfiniDB engine originally developed by the now-defunct company Calpont for older MySQL version 5 databases, can be added to any MariaDB implementation and exist side by side with the regular row-based storage engine, Thompson says. “We’re looking at how do we make the transition between OLTP and analytics more seamless,” he says.

Architecturally, MariaDB is trying to present best-of-breed storage engines within the MariaDB server in a simple manner. “If you want to do OLTP, then use InnoDB. For analytics, you have ColumnStore,” Thompson says. “The system allows you to mix queries across different storage engines. So you can have your big terabyte- or petabyte-scale fact table in ColumnStore, but still reference the transactional look-up tables that are stored in InnoDB.”

MariaDB ColumnStore is available now. The company is also offering a professional services engagement called MariaDB ColumnStore JumpStart that can deliver a ready-to-use ColumnStore environment in three to five days.
