Three Ways Zoomdata Makes Big Data Pop
When it comes to big data visualization tools, there’s no shortage of players. Tableau, Qlik, Spotfire, and MicroStrategy are established incumbents with big followings. But there’s a fresh crop of visualization tools making waves, including one from Zoomdata that’s helping to change how we think about big data analysis.
Here are three ways that Zoomdata is helping to change the field of big data analytics and visualization:
1. Micro Queries and Data Sharpening
Zoomdata’s flagship analytics and visualization tool embraces the concept of “micro queries.” Instead of composing a query, hitting the “go” button, and waiting for it to return, the software begins returning results almost immediately.
This approach lets data analysts and scientists query large data sets (measuring in the billions of rows) that would otherwise require long waits for queries to return, delays that can stifle innovation and slow discovery. Users get the sensation of “zooming” through data (including real-time data and historical data sets) via the product’s Web-based interface.
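To make the micro-query idea concrete, here is a minimal sketch of progressive aggregation: instead of scanning all rows before answering, the query emits a running estimate after each chunk, and the answer “sharpens” as more data is seen. This is a toy illustration only; Zoomdata’s patented query optimizer and data-sharpening machinery are certainly far more sophisticated, and all names below (`micro_query_mean`, the chunk size) are hypothetical.

```python
import random

def micro_query_mean(rows, chunk_size=1000):
    """Progressively estimate the mean of a large numeric column.

    Yields a partial answer after each chunk of rows; early
    estimates are approximate and converge to the exact value
    once every chunk has been consumed.
    """
    total, count = 0.0, 0
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        total += sum(chunk)
        count += len(chunk)
        yield total / count  # partial result, refined each iteration

random.seed(42)
data = [random.gauss(100, 15) for _ in range(10_000)]
estimates = list(micro_query_mean(data))
# Early estimates are rough; the final one is exact.
print(round(estimates[0], 2), round(estimates[-1], 2))
```

In a real system the chunks would stream back from the data store, so the first partial visualization appears within milliseconds rather than after a full table scan.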
The intellectual property behind the micro queries forms the heart of Zoomdata’s value proposition. “We have a very sophisticated query optimizer and we’ve received patents in our data optimization and data sharpening capabilities,” says Zoomdata CMO Nick Halsey. “We have a customer in insurance, and their users are interacting with 9 billion rows of data, and have a six-second response time.”
In the Hadoop world, where waits of 10 minutes or more are not uncommon against data sets of a billion rows or more, that kind of response time would be considered very fast. “But our team is frustrated,” Halsey says. “They think six seconds is not fast enough.”
According to Halsey, the micro queries also provide a big hook to land big fish from Zoomdata’s top competitor, Tableau. “A lot of our prospects come to us because they are Tableau users,” says Halsey, who was on Tableau’s advisory board when Christian Chabot founded the company. “As they scale up a big data or real-time application and they try and use those tools, they just don’t work very well. You get to a certain data size and they choke.”
2. Leave the Data Where It Is
One of the definitions of “big data” is that it’s too big to move. To that end, one of the hallmarks of Zoomdata’s approach is leaving the data where it is, and pushing queries down to the underlying data store where possible.
Zoomdata builds specialized connectors that push analytic processing down to each supported data store. The company supports Hadoop-based SQL stores like Impala or Hive; NoSQL stores like MongoDB; NewSQL stores like MemSQL; relational databases like Oracle and SQL Server; cloud stores like Redshift and Salesforce.com; search engines like Solr and Elasticsearch; and streaming data sources like Twitter, Kinesis, and InfoSphere Streams (currently in development).
“The bulk of the work is happening in the big data source,” Zoomdata product manager Scott Cappiello tells Datanami. “You really have to treat each of those as something special, because they are. There is something special and a reason why you chose that data framework. It fits the data. So you need a visualization layer that’s going to take advantage of what’s great about each of those frameworks.”
Zoomdata builds each connector using the native APIs of each store, where available. This provides an advantage over other tools that use a generic ODBC or JDBC connector, Cappiello says.
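The pushdown idea above can be sketched with a toy contrast: a generic-connector approach ships every row to the client and aggregates there, while a pushdown connector asks the store to do the `GROUP BY` so only the small aggregated result crosses the wire. SQLite stands in here for the backing store (the real targets would be Impala, MongoDB, Solr, and so on, each via its native API); the table and function names are hypothetical.

```python
import sqlite3

# Toy in-memory stand-in for a backing data store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 10.0), ("east", 5.0), ("west", 7.5)])

def naive_pull(conn):
    # Generic-connector style: fetch every raw row, aggregate client-side.
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals

def pushdown(conn):
    # Pushdown style: the store performs the aggregation; only a
    # handful of summary rows travel back to the visualization layer.
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region")
    return dict(cur.fetchall())

print(naive_pull(conn) == pushdown(conn))  # same answer, far less data moved
```

At billions of rows, the difference between moving raw rows and moving a few aggregated groups is the difference between minutes and seconds.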
“In other tools, you’d be spinning the hour glass waiting for the query to come back,” he says. “With Zoomdata, as soon as we have something, we’re able to show a piece of the visualization to you and it sharpens over time as you’re analyzing it. You don’t have to wait to take action. You can zoom down and see more detail even while that sharpening is happening.”
3. Make Multiple Data Sources Look Like One
You can’t push all queries down to the source. Joins, for instance, don’t work that way. One of the most important things analysts are doing with big data is creating “mashups” that combine multiple sources, and Zoomdata is supporting this need within its tool.
The company is currently building a new capability called Fusion that lets users combine two sources. With Fusion (expected to become available next month), the company aims to enable regular data analysts to do sophisticated data manipulations that have typically required skilled IT professionals.
“It makes multiple sources appear as one,” Cappiello says. “We’re recognizing the reality that enterprises today are going to have a mix of different kinds of data frameworks. They’re going to have big data they brought online, but they’re also going to have enterprise data warehouses and analytic RDBMSs that still have utility in the organization. But you’d like to be able to bring those together and make multiple sources appear as a single source.”
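Since joins can’t always be pushed down to a single source, a mediating layer has to stitch the results together itself. A minimal sketch of that idea, assuming two hypothetical result sets (one from a big data store, one from a warehouse) and a simple hash join on a shared key:

```python
# Hypothetical result sets from two independent backends.
clicks = [  # e.g. clickstream counts from a Hadoop/Impala source
    {"customer_id": 1, "clicks": 120},
    {"customer_id": 2, "clicks": 45},
]
warehouse = [  # e.g. customer attributes from an analytic RDBMS
    {"customer_id": 1, "segment": "enterprise"},
    {"customer_id": 2, "segment": "smb"},
]

def fuse(left, right, key):
    """Hash join of two result sets on a shared key, so that two
    physically separate sources appear to the user as one table."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]

fused = fuse(clicks, warehouse, "customer_id")
print(fused[0])
```

Whether Zoomdata’s Fusion works anything like this internally isn’t described in the article; the point is only that the join happens in the mediating layer, not in either backend.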
Zoomdata is only three years old and has a ways to go if it’s going to beat out other startups to unseat Tableau at the top of the visualization heap. The company, which has raised $22 million to date, has about 30 customers, one of which signed an eight-figure deal.
The Reston, Virginia-based company is growing fast, and now has more than 90 employees. But perhaps the best thing Zoomdata has going for it is its partnerships with Cloudera and Hortonworks, which together account for most of its leads. The company is working with Hortonworks on a Hive on Tez connector, which should be out soon and could bolster the Hadoop distributor’s efforts to increase Hive adoption.
Last month, Cloudera announced that Zoomdata would be included in its Cloudera Live sandbox running on AWS. Just two visualization tool vendors made the cut. “They picked Tableau for their reach in the market and excitement,” Halsey says. “And they have us because we’re the ones that actually work on large data sets.”