January 17, 2014

Zooming Through Historical Data with Streaming Micro Queries

Alex Woodie

Stream processing engines, such as Storm and S4, are commonly used to analyze real-time data as it flows into an organization. But did you know you can use this technology to analyze historical data too? A company called Zoomdata recently showed how.

In a recent YouTube presentation, Zoomdata’s Justin Langseth demonstrated his company’s technology, which combines open source stream processing engines like Apache Storm with data connection and visualization libraries based on D3.js.

“We’re doing data analytics and visualization a little differently than it’s traditionally done,” Langseth says in the video. “Legacy BI tools will generate a big SQL statement, run it against Oracle or Teradata, then wait for two to 20 to 200 seconds before showing it to the user. We use a different approach based on the Storm stream processing engine.”

Once Zoomdata is hooked up to a data source, such as Cloudera Impala or Amazon Redshift, data is fed into the platform, which performs calculations against the data as it flows in, “kind of like continuous event processing but geared more toward analytics,” Langseth says.

“We use that for real time data but also for historical,” he continues. “Instead of launching big queries and waiting for results, we run streams of little tiny queries against historical data and process the results of those micro queries, as we call them, also through the stream processing engine. That allows us to very quickly visualize very large sets of data, and do it almost instantaneously.”
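The micro query idea Langseth describes can be illustrated with a small sketch. This is not Zoomdata's code: the record shape, the function names `micro_query_windows` and `stream_totals`, and the in-memory `ROWS` table standing in for a real data source are all assumptions made for illustration. It splits a historical time range into small windows, runs one tiny query per window, and updates running totals as each result arrives.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical record shape: (timestamp, category, value).
Record = Tuple[datetime, str, float]


def micro_query_windows(
    start: datetime, end: datetime, step: timedelta
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive [lo, hi) windows that together cover [start, end)."""
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        yield lo, hi
        lo = hi


def stream_totals(
    run_query: Callable[[datetime, datetime], List[Record]],
    start: datetime,
    end: datetime,
    step: timedelta,
) -> Iterator[Dict[str, float]]:
    """Run one micro query per window; yield running totals after each one."""
    totals: Counter = Counter()
    for lo, hi in micro_query_windows(start, end, step):
        for _, category, value in run_query(lo, hi):
            totals[category] += value
        yield dict(totals)  # snapshot a chart could redraw from


# Toy in-memory "table" standing in for a real data source (assumption).
ROWS: List[Record] = [
    (datetime(2014, 1, 1, 0, 30), "east", 10.0),
    (datetime(2014, 1, 1, 1, 15), "west", 5.0),
    (datetime(2014, 1, 1, 2, 45), "east", 2.5),
]


def query_rows(lo: datetime, hi: datetime) -> List[Record]:
    """Stand-in for a small range query against the historical store."""
    return [row for row in ROWS if lo <= row[0] < hi]


snapshots = list(
    stream_totals(
        query_rows,
        datetime(2014, 1, 1, 0, 0),
        datetime(2014, 1, 1, 3, 0),
        timedelta(hours=1),
    )
)
```

Each element of `snapshots` is the state of the totals after one more micro query, so a front end could redraw after every element instead of waiting for a single large query to finish.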

The Zoomdata “Time Bar” feature provides a DVR-like interface that allows users to zoom forward and backward through time to see how various properties change. “You can see the historical data as run through the stream processing engine,” he says. “When you get back to the live point, it stops fast forwarding.”

Using micro queries allows Langseth and company to keep everything working fast and intuitively despite the large data sets measured in billions of records. “As the micro queries come in, it allows us to draw an estimated picture of the data within a second or so, and then as more micro queries come in, we sharpen, if you will, the result display,” he says.
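One simple way to picture this sharpening effect is to extrapolate a total from whatever fraction of the micro queries has finished. The sketch below is only an illustration under stated assumptions: the article does not say how Zoomdata computes its estimates, and the function name `progressive_estimates`, the random visiting order, and the linear scaling rule are all hypothetical choices.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def progressive_estimates(
    chunks: Sequence[Sequence[float]], seed: int = 0
) -> Iterator[Tuple[float, float]]:
    """Yield (fraction_done, estimated_total) after each chunk is processed.

    Each chunk stands for the result of one micro query. Chunks are visited
    in random order so that early estimates are not skewed toward one end
    of the data (an assumption of this sketch, not a documented behavior).
    """
    order = list(range(len(chunks)))
    random.Random(seed).shuffle(order)
    partial = 0.0
    for done, idx in enumerate(order, start=1):
        partial += sum(chunks[idx])
        # Scale the partial sum up by the inverse of the fraction seen so far.
        yield done / len(chunks), partial * len(chunks) / done


# Toy micro query results (made-up numbers for illustration).
chunks: List[List[float]] = [[1.0, 2.0], [3.0], [4.0, 5.0], [6.0]]
estimates = list(progressive_estimates(chunks))
```

The first estimate is rough because it rests on a single chunk, and the last one equals the exact total because every chunk has been counted. That mirrors the "estimated picture" followed by a sharper display that Langseth describes.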

This is similar to the way YouTube works. “If you start playing a YouTube video, it starts out kind of fuzzy, but it’s watchable and it starts immediately instead of buffering for 30 seconds like in the old days,” Langseth says. “So we’re doing the same thing for the historical micro queries against the big data. As you watch it, it gets sharper and sharper.”

Zoomdata, based in Reston, Virginia, received a U.S. patent this week related to its real-time data visualization technology. The company received $4.1 million in venture funding last year.
