Aerospike’s Presto Connector Goes Live
Aerospike’s connector to the Presto SQL query engine is transitioning from beta to general availability, the real-time NoSQL database vendor said this week.
Presto, the highly parallel and distributed SQL query engine, was originally developed at Facebook (NASDAQ: FB) as the follow-on to Apache Hive. Aerospike said Thursday (Jan. 14) its Presto link would allow data analysts to use standard ANSI SQL to query data stored in its database via Presto.
Aerospike, Mountain View, Calif., also said the multi-tenant platform supports hundreds of concurrent queries requiring substantial memory, I/O and CPU resources.
The Presto connector is a Java application, which according to the company is distributed as a “bundle of jars.” The configuration enables ANSI SQL queries “in-place” on huge data sets. The connector eliminates the need to copy data to other analytics platforms, a feature the company promotes as supporting data governance and regulatory compliance.
The connector also supports Presto data types, including arrays and maps, as well as read/write SQL statements.
“Interactive queries demand near real-time response times,” the company noted in a blog post. “No matter how optimized [an] SQL query engine is, most of the performance gains are wiped out by slow reads from the underlying database.
“Slower queries can lead to loss of productivity, which can be very pronounced for large enterprises where millions of queries are run on a daily basis,” it added.
The Presto connector also enables federated queries across multiple Aerospike clusters or between other databases and Aerospike, a capability that allows users to join tables from different data sources. That capability reflects typical enterprise deployments, which often span several different databases. The connector can plug an Aerospike cluster into diverse database deployments, including Apache Cassandra, PostgreSQL and others.
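For readers unfamiliar with federated queries, the idea can be illustrated with a small, self-contained sketch. It does not use Presto or Aerospike: two in-memory SQLite databases, attached under the hypothetical names `aerospike` and `postgresql`, stand in for two Presto catalogs, and the table names and data are invented for illustration. In Presto itself, tables are addressed as `catalog.schema.table` rather than the two-part names shown here.

```python
import sqlite3

# One connection plays the role of the query engine; each attached
# in-memory database stands in for a separate data source (catalog).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS aerospike")
conn.execute("ATTACH DATABASE ':memory:' AS postgresql")

# Hypothetical tables and rows, one table per "source".
conn.execute("CREATE TABLE aerospike.profiles (user_id INTEGER, segment TEXT)")
conn.execute("CREATE TABLE postgresql.orders (user_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO aerospike.profiles VALUES (?, ?)",
    [(1, "gold"), (2, "silver")],
)
conn.executemany(
    "INSERT INTO postgresql.orders VALUES (?, ?)",
    [(1, 40.0), (1, 10.0), (2, 5.0)],
)

# A single SQL statement joins tables that live in different sources.
FEDERATED_QUERY = """
SELECT p.segment, SUM(o.total) AS revenue
FROM aerospike.profiles AS p
JOIN postgresql.orders AS o ON o.user_id = p.user_id
GROUP BY p.segment
ORDER BY p.segment
"""
rows = conn.execute(FEDERATED_QUERY).fetchall()
print(rows)
```

The point of the sketch is that the analyst writes one query and the engine resolves each table against its own source, so no data has to be copied into a common store first.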
Aerospike’s NoSQL database is schema-less. The Presto connector reconciles differences in schema, a feature designed to provide users a familiar SQL experience while at the same time leveraging the advantages of a NoSQL platform.
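Aerospike has not detailed here how the connector derives a schema, but the general problem of presenting schema-less records as SQL rows can be sketched as follows. The function names `infer_columns` and `to_rows` and the sample records are hypothetical, and the approach shown (take the union of field names, fill gaps with NULL) is one common technique, not necessarily the connector's.

```python
from collections import defaultdict


def infer_columns(records):
    """Map each field name to the sorted list of Python type names seen for it."""
    seen = defaultdict(set)
    for record in records:
        for name, value in record.items():
            seen[name].add(type(value).__name__)
    return {name: sorted(types) for name, types in seen.items()}


def to_rows(records, columns):
    """Project every record onto the same column list; missing fields become None (SQL NULL)."""
    return [tuple(record.get(col) for col in columns) for record in records]


# Two records from the same hypothetical set, with different fields.
records = [
    {"id": 1, "name": "ada"},
    {"id": 2, "tags": ["x", "y"]},
]

schema = infer_columns(records)
columns = sorted(schema)
rows = to_rows(records, columns)
print(columns)
print(rows)
```

Once every record has been projected onto the same column list, a SQL engine can treat the set as an ordinary table, with list-valued fields mapping naturally onto an array type.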
In releasing the beta version of the Presto connector last fall, Aerospike noted “the industry has a lot of use cases for running SQL queries on top of NoSQL big data clusters for ad-hoc analytics, BI [and others]. We looked for open-source and durable software to give us an ability to speak the language of relational databases.”
As for optimizing query performance, Aerospike said a “coordinator” queries a connector for a list of splits available for a given table.
“The coordinator keeps track of which machines are running which tasks, and what splits are being processed by which tasks,” the company said.
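The bookkeeping the company describes can be pictured with a toy model. Everything in the sketch below is an assumption made for illustration: the `Coordinator` class, the `list_splits` helper, the worker names and the round-robin assignment are hypothetical and are not Presto's or Aerospike's actual scheduler.

```python
from collections import defaultdict


def list_splits(table, partitions=4):
    """Stand-in for a connector: return the splits available for a table."""
    return [f"{table}/partition-{n}" for n in range(partitions)]


class Coordinator:
    """Tracks which machine runs which task, and which splits each task processes."""

    def __init__(self, workers):
        self.workers = list(workers)
        self.task_worker = {}                  # task id -> worker machine
        self.task_splits = defaultdict(list)   # task id -> assigned splits

    def schedule(self, table, splits):
        # Start one task per worker for this table.
        tasks = []
        for worker in self.workers:
            task_id = f"{table}@{worker}"
            self.task_worker[task_id] = worker
            tasks.append(task_id)
        # Hand out splits to the tasks in round-robin order.
        for i, split in enumerate(splits):
            self.task_splits[tasks[i % len(tasks)]].append(split)
        return tasks


coordinator = Coordinator(["worker-a", "worker-b"])
tasks = coordinator.schedule("users", list_splits("users"))
for task_id in tasks:
    print(task_id, coordinator.task_worker[task_id], coordinator.task_splits[task_id])
```

Because each split is an independent slice of the table, the tasks can read their splits in parallel, which is where the query-time gains from splitting would come from.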