Jethro Indexer Adds ‘Auto-Cubes’
The new version of a SQL query engine for Hadoop scheduled for release later this month adds “auto-cubes,” a type of multi-dimensional dataset that, in this implementation, consists of aggregated “micro-cubes” that are generated based on usage patterns learned by the SQL accelerator.
New York-based Jethro (formerly JethroData) said it plans to release its 2.0 version during Strata+Hadoop later this month. Jethro’s platform combines two engines: a columnar SQL database and a search index.
The startup said the new auto-cubes feature complements its full-indexing and intelligent caching approaches designed to accelerate SQL query performance in business intelligence and dashboards. Auto-cubes “are generated based on user activity and are maintained and updated transparently,” Jethro CTO Boaz Raufman noted in a statement.
The startup further claims that dynamically aggregated micro-cubes reduce the need for complex data cube designs, boost coverage via “hundreds” of small cubes and support incremental data loads. The combination of dynamic aggregation of auto-cubes and indexing is intended to further accelerate SQL queries to handle a broader range of big data use cases. It also provides interactive response times for business intelligence applications in more user scenarios, Raufman added.
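Jethro has not published how its auto-cubes are built, so the following is only a rough Python sketch of the general idea: small aggregates are materialized once a query pattern recurs, then updated in place on incremental loads. The class name `AutoCubeCache`, the `threshold` parameter and the sample rows are all hypothetical and are not part of Jethro's product.

```python
from collections import Counter, defaultdict


class AutoCubeCache:
    """Toy usage-driven aggregate cache (illustrative only, not Jethro's design)."""

    def __init__(self, rows, threshold=2):
        self.rows = rows            # raw fact rows, each a dict
        self.threshold = threshold  # hypothetical: repeats needed before caching
        self.usage = Counter()      # how often each query pattern was seen
        self.cubes = {}             # materialized "micro-cubes"

    def _aggregate(self, dims, measure):
        # Full scan: sum the measure for each combination of dimension values.
        totals = defaultdict(int)
        for row in self.rows:
            totals[tuple(row[d] for d in dims)] += row[measure]
        return dict(totals)

    def query(self, dims, measure):
        # Dimensions are sorted so the same pattern always maps to one cube.
        key = (tuple(sorted(dims)), measure)
        if key in self.cubes:
            return dict(self.cubes[key]), "cube"
        self.usage[key] += 1
        result = self._aggregate(key[0], measure)
        if self.usage[key] >= self.threshold:
            self.cubes[key] = dict(result)  # pattern is popular: keep the aggregate
        return result, "scan"

    def load(self, new_rows):
        # Incremental load: append rows and fold them into existing cubes
        # instead of rebuilding each cube from scratch.
        self.rows.extend(new_rows)
        for (dims, measure), cube in self.cubes.items():
            for row in new_rows:
                k = tuple(row[d] for d in dims)
                cube[k] = cube.get(k, 0) + row[measure]


rows = [
    {"region": "east", "product": "a", "sales": 10},
    {"region": "west", "product": "b", "sales": 20},
    {"region": "east", "product": "a", "sales": 30},
    {"region": "north", "product": "a", "sales": 40},
]
cache = AutoCubeCache(list(rows), threshold=2)
first, first_source = cache.query(["region"], "sales")    # answered by scan
second, second_source = cache.query(["region"], "sales")  # scan, then cached
third, third_source = cache.query(["region"], "sales")    # answered from cube
cache.load([{"region": "east", "product": "b", "sales": 5}])
fourth, fourth_source = cache.query(["region"], "sales")  # cube reflects the new row
```

In this sketch, a dashboard that repeatedly asks for sales by region stops scanning raw rows after the second request, and later loads adjust the cached totals rather than invalidating them.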
Along with auto-cubes, Jethro 2.0 improves support for QlikView and Qlik Sense and broadens SQL coverage with expanded math functions. The point is to make big data analytics work in real time.
The startup differentiates its acceleration engine technology by allowing users to keep data on Hadoop while retaining the performance of an enterprise data warehouse engine. The engine is “sandwiched” between a BI tool and existing data sources. Jethro is intended to accelerate BI tool reporting and visualizations without overtaxing a Hadoop cluster.
Jethro essentially takes a column-oriented database (like Vertica or Impala) and combines it with a search engine indexing tool. The resulting columnar-based database is fully indexed, where each additional column of data is treated as its own index.
The index-based SQL engine for Hadoop seeks to enable organizations to use their BI tools with large datasets while maintaining interactive speed. It works by fully indexing select datasets in Hadoop. BI queries use indexes to access only the data they need instead of scanning an entire dataset. The result is supposed to be increased speed and less stress on computing resources.
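The contrast between index lookups and full scans can be illustrated with a minimal Python sketch. It assumes a tiny in-memory table and simple equality filters; the data, the function names (`build_index`, `indexed_sum`, `scan_sum`) and the index layout are hypothetical and say nothing about Jethro's actual index format.

```python
from collections import defaultdict

# Toy table stored column by column (hypothetical data).
columns = {
    "region": ["east", "west", "east", "north", "west", "east"],
    "product": ["a", "b", "a", "a", "b", "b"],
    "sales": [10, 20, 30, 40, 50, 60],
}


def build_index(values):
    """Map each distinct value to the list of row ids that hold it."""
    index = defaultdict(list)
    for row_id, value in enumerate(values):
        index[value].append(row_id)
    return dict(index)


# Every column gets its own index, mirroring the "fully indexed" idea.
indexes = {name: build_index(values) for name, values in columns.items()}


def indexed_sum(measure, filters):
    """Sum `measure` over rows matching all equality filters, via indexes."""
    row_sets = [set(indexes[col].get(val, [])) for col, val in filters.items()]
    if row_sets:
        matching = set.intersection(*row_sets)
    else:
        matching = set(range(len(columns[measure])))
    # Only the matching rows of the measure column are read.
    return sum(columns[measure][r] for r in matching)


def scan_sum(measure, filters):
    """Answer the same query by checking every row, for comparison."""
    total = 0
    for r in range(len(columns[measure])):
        if all(columns[col][r] == val for col, val in filters.items()):
            total += columns[measure][r]
    return total


east_a_indexed = indexed_sum("sales", {"region": "east", "product": "a"})
east_a_scanned = scan_sum("sales", {"region": "east", "product": "a"})
```

Both functions return the same total, but the indexed version reads only the rows named by the intersected index entries, which is the effect the article attributes to index-based access.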
Jethro has scheduled a webinar for Sept. 15 to demonstrate how new “auto microcubes” work with indexing and smart caching to accelerate interactive business intelligence applications.
The startup has so far raised $12.6 million in two funding rounds, including an $8.1 million Series B funding round in March 2015.