August 13, 2024

Slicing and Dicing the Real-Time Analytics Database Market


Organizations in the market for an analytics database that can serve an enormous number of queries on massive sets of fast-changing data may want to check out the latest GigaOm Sonar report on real-time analytics databases.

Real-time analytics databases are a relatively new product category that has emerged over the past few years to serve the most demanding analytics workloads. The offerings in this sector combine existing technological capabilities, like OLAP and streaming data, in new ways to address novel data processing challenges at massive scale.

The new GigaOm Sonar report from Andrew Brust, the analyst group’s longtime research director, covers the emerging market and its biggest players, including Aerospike, ClickHouse, Imply, Kinetica, Materialize, MotherDuck, SingleStore, StarRocks, and StarTree. (Brust undoubtedly would also have included Rockset if it hadn’t been acquired by OpenAI in June.)

Brust notes that real-time analytics databases don’t represent a revolutionary new type of technology but rather an evolution of existing ones.

“These databases have their roots in traditional online analytical processing (OLAP) databases; however, they surpass these predecessors by providing the ability to connect to and ingest extremely large (up to petabyte-scale) volumes of data, often from streaming data sources and batch or change data capture (CDC) sources,” he notes in the report.

Real-time analytics databases share traits with other database types (Image courtesy GigaOm)

“To facilitate analytics over large volumes of data with minimal latency, the databases in this category make use of structural and architectural optimizations,” he continues. “Examples include columnar orientation, various types of indexing, partitioning, and segmentation, precomputations of aggregations to accelerate queries, and vector processing. Scalability (the resilience of the system under the demands of increasing workloads) and high availability are also important in this category because of the time-critical nature of the analysis.”
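
To make one of those optimizations concrete, here is a minimal Python sketch of precomputing aggregations. It is an illustration only, not how any vendor in the report implements the idea: the event data, the `build_rollup` helper, and the two query functions are all hypothetical names invented for this example.

```python
from collections import defaultdict

# Hypothetical raw events: (minute, ad_id, clicks). Illustrative data only.
events = [
    (0, "ad-a", 1),
    (0, "ad-b", 2),
    (0, "ad-a", 3),
    (1, "ad-a", 4),
    (1, "ad-b", 5),
]


def build_rollup(rows):
    """Precompute total clicks per (minute, ad_id), as if done at ingestion."""
    rollup = defaultdict(int)
    for minute, ad_id, clicks in rows:
        rollup[(minute, ad_id)] += clicks
    return dict(rollup)


def clicks_for_ad_scan(rows, ad_id):
    """Baseline: answer the query by scanning every raw event."""
    return sum(clicks for _, a, clicks in rows if a == ad_id)


def clicks_for_ad_rollup(rollup, ad_id):
    """Answer the same query from the smaller precomputed table."""
    return sum(total for (_, a), total in rollup.items() if a == ad_id)


rollup = build_rollup(events)
print(clicks_for_ad_scan(events, "ad-a"), clicks_for_ad_rollup(rollup, "ad-a"))
```

Both functions return the same total, but the rollup version reads one row per minute and ad rather than one row per event. At the event volumes these databases handle, that gap is the point of the optimization.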

Some of the biggest and toughest big data workloads in existence today are running atop real-time analytics databases, such as the hundreds of millions of daily auctions run by ad-tech firm Sovrn (a StarTree customer) and the 1.5 billion events processed daily by Cisco ThousandEyes (an Imply customer), in addition to other use cases at Uber, Target, and Netflix. Many of the gnarliest real-time analytics use cases involve consumer-facing Web applications, thanks to the unique combination of data scale, data freshness, query throughput, and query latency demands that billions of consumers can bring.

Brust rated the nine vendors on the seven characteristics he deemed most important for real-time analytics databases: storage/analytics optimizations, data ingestion, analytics preprocessing, schema management, client/tool connectivity, scalability, and high availability.

The result was a five-way tie for first place among ClickHouse, Imply, Kinetica, StarRocks, and StarTree, each of which had an average score of 2.6 stars (out of three). SingleStore occupied sixth place with a score of 2.4, MotherDuck came in seventh with a score of 2.3, and Aerospike and Materialize tied for eighth place with scores of 2.1.
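
The placements above follow standard competition ranking, in which tied vendors share a place and the next place skips ahead by the size of the tie. The short Python sketch below reproduces them from the average scores quoted in this article. The `competition_rank` helper is a hypothetical name used for illustration and is not part of the GigaOm methodology.

```python
# Average scores as quoted in the article (out of three stars).
scores = {
    "ClickHouse": 2.6,
    "Imply": 2.6,
    "Kinetica": 2.6,
    "StarRocks": 2.6,
    "StarTree": 2.6,
    "SingleStore": 2.4,
    "MotherDuck": 2.3,
    "Aerospike": 2.1,
    "Materialize": 2.1,
}


def competition_rank(score_by_vendor):
    """Rank vendors so that ties share a place and later places skip ahead."""
    ordered = sorted(score_by_vendor.values(), reverse=True)
    # A vendor's place is the position of the first occurrence of its score.
    return {name: ordered.index(s) + 1 for name, s in score_by_vendor.items()}


ranks = competition_rank(scores)
for name, place in sorted(ranks.items(), key=lambda item: (item[1], item[0])):
    print(place, name, scores[name])
```

Because five vendors share first place, the next available place is sixth, which is why no vendor is listed second through fifth.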

GigaOm Sonar report on real-time analytics databases (Image courtesy GigaOm)

Brust notes that all of the vendors included in the Sonar report are considered innovators, and that they’re all data platform builders, rather than just providers of features. All of the products are “comprehensive, well-rounded” offerings, he writes.

“In reviewing solutions, it’s important to keep in mind that there are no universal ‘best’ or ‘worst’ offerings,” the longtime big data analyst and ZDNet contributor writes. “There are aspects of every solution that might make it a better or worse fit for specific customer requirements. Prospective customers should consider their current and future needs when comparing solutions and vendor roadmaps.”

Kishore Gopalakrishna, the co-founder and CEO of StarTree, the commercial outfit behind the open source Apache Pinot project, heralded the findings of the report.

“This acknowledgment reflects our unwavering commitment to innovation and delivering cutting-edge solutions that empower our customers to unlock the full potential of their data,” Gopalakrishna said in a press release. “We will continue to push the boundaries of what’s possible in real-time analytics.”

You can access a reprint of the GigaOm Sonar Report here.

Related Items:

Real-Time Analytics Databases Emerge to Take On Big, Fast-Moving Data

Yes, Real-Time Streaming Data Is Still Growing

StarTree Finds Apache Pinot the Right Vintage for IT Observability

Datanami