March 20, 2019

Rockset Cranks Up Serverless Analytics Engine

Cloud-based serverless data preparation tools are gaining traction among data analysts seeking to move beyond traditional approaches to organizing low-quality but potentially valuable data. That approach promises to provide data scientists with an alternative to standard ETL routines developed to plumb data warehouses.

The serverless analytics segment is evolving with new search and analytics tools that aim to reduce the time needed to organize data for analysis from days to hours. The latest serverless platform comes from startup Rockset, which this week unveiled a serverless cloud service it claims will free data scientists from managing ETL pipelines, allowing them to organize data sets in “minutes.”

Rockset notes that emerging digital applications are driving the requirement for “clean data” from a growing list of sources. “However, businesses are struggling with tons of high-value, low-quality data in fragmented systems like data lakes, NoSQL databases and data streams,” the startup emphasized in releasing its platform on Tuesday (March 19).

Moreover, this mish-mash of data originates from different sources. Most of it is either machine data in JSON, CSV, XML and Parquet formats or business data in XLSX and PDF formats.

Rockset’s serverless backend is designed to continuously ingest this raw data as it is generated, with the goal of delivering real-time SQL queries in a framework that can scale. The scheme is built around Rockset’s proprietary “Converged Indexing” technology, which combines inverted, columnar and document indices. The approach is tuned to key-value, time-series, graph-type and other queries “out of the box,” Rockset claims.
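
To make the idea of combining several index types more concrete, here is a toy sketch in Python. It is not Rockset's implementation. It assumes only that each ingested record is kept three ways at once: as a whole document, column by column, and in an inverted (search) index. The class name `ToyConvergedIndex`, its methods and the sample sensor records are all hypothetical, and records are assumed to be flat with hashable values.

```python
from collections import defaultdict


class ToyConvergedIndex:
    """Illustrative only: stores every ingested record in three layouts."""

    def __init__(self):
        self.documents = {}               # doc_id -> full record (document layout)
        self.columns = defaultdict(dict)  # field -> {doc_id: value} (columnar layout)
        self.inverted = defaultdict(set)  # (field, value) -> {doc_id} (inverted layout)

    def ingest(self, doc_id, record):
        # No schema is declared up front; fields are indexed as they appear.
        self.documents[doc_id] = record
        for field, value in record.items():
            self.columns[field][doc_id] = value
            self.inverted[(field, value)].add(doc_id)

    def get(self, doc_id):
        # Key-value style lookup served from the document layout.
        return self.documents[doc_id]

    def column_values(self, field):
        # Aggregation-style scan served from the columnar layout.
        return list(self.columns.get(field, {}).values())

    def find(self, field, value):
        # Selective filter served from the inverted layout.
        return sorted(self.inverted.get((field, value), set()))


# Hypothetical sample data: three sensor readings with differing fields.
index = ToyConvergedIndex()
index.ingest(1, {"device": "sensor-a", "temp": 21})
index.ingest(2, {"device": "sensor-b", "temp": 25})
index.ingest(3, {"device": "sensor-a", "temp": 23, "status": "ok"})

matching_ids = index.find("device", "sensor-a")
total_temp = sum(index.column_values("temp"))
```

In this sketch a point lookup, a column scan and a filter each read from the layout that suits them best, which is the general trade-off such a design makes: more work and storage at ingest time in exchange for cheaper queries of several different kinds.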

The serverless platform runs on top of the company’s database and its cloud-native “distributed query engine” to handle interactive data analytics and real-time applications. Schema-less data can be pulled in from databases, data lakes and streams ranging from Apache Kafka to Amazon Web Services’ S3 and Google Cloud Storage.

The cloud service is available now.

Rockset emerged from stealth mode this past November with the announcement of a $21.5 million funding round. The startup’s co-founders, Venkat Venkataramani and Dhruba Borthakur, helped build Facebook’s (NASDAQ: FB) online data and search infrastructure. Other Rockset engineers are credited with helping create the Hadoop Filesystem at Yahoo, implementing the Gmail backend infrastructure at Google (NASDAQ: GOOGL) as well as databases at Oracle (NYSE: ORCL).

The startup, based in San Mateo, Calif., was founded in 2016 on the premise that traditional SQL databases can’t handle the scale of streaming data. Meanwhile, NoSQL systems fall short in supporting complex queries. That has forced developers to use different tools for storing, processing and delivering data.
