
October 26, 2012

Visualizing Big Data's Key Partner


Visualization is vital to managing big data. The right charts, graphs, and other representations of large datasets let business users see trends they would otherwise not know existed. And after a busy week at Hadoop World, Tableau appears to be at the forefront of the big data visualization market.

We caught up with Dan Jewett, Vice President of Product Management, to talk about what the new partnerships, which they announced over the course of the week, mean for the company that specializes in visualization.

For Jewett, one of the more prominent themes across the big data spectrum is performance. “Performance is critical and we’re trying to see a renewed emphasis on performance for all these vendors,” said Jewett with regard to how Tableau decided which vendors to partner with. One of their biggest partners, and one with whom they made a big joint announcement yesterday, is Hadoop distribution vendor Cloudera.

“We’ve actually been supporting connecting to Cloudera for a little over a year now,” said Jewett of what they consider to be one of their more important partners. Cloudera announced yesterday their ambitious new Impala platform, which aims to turn Hadoop from solely a batch processing tool into one that can provide analysis in real time.

Tableau is to be the major visualization partner behind Impala, a product that Cloudera CEO Mike Olson has outlined. According to Jewett, Impala processes data an order of magnitude faster than before. For business users looking to leverage that speed, the visual analytics side has to keep up. Jewett and Cloudera hope Tableau is up for the challenge.

Tableau also announced a significant first-time partnership with Hortonworks. With that announcement, Tableau completes the hat trick of major Hadoop distributors, as they also work closely with MapR. As a result, Tableau has more or less established itself in the open source big data realm, at least until some other framework overtakes Hadoop.

Along with Cloudera and Hortonworks, Tableau announced several new partnerships that involve unstructured data (with an emphasis on text analytics), collaborative data science efforts and more. For example, Greenplum’s open source Chorus initiative, which is hooked up to the Kaggle data science competition community, has its visual side powered by Tableau. Jewett also mentioned Digital Reasoning, a start-up focused on bringing unstructured data, and text in particular, into the realm of the structured.

Underlying those connectors is Simba, a company with which Tableau has a good relationship. According to Jewett, Simba serves as a conduit for delivering the data from the databases to Tableau. “Simba provides a couple of things,” Jewett said, “including the raw transport from us over to the databases and doing some additional translation of the statements that gets sent through the driver to be appropriate for the database that it’s talking to. For example, with the Hadoop guys, it’s a driver that talks to Hive, so it’s the transport from the client tool over to the Hive engine on the back-end server.”

This means that Tableau can deal with the data on its own terms, while external drivers such as Simba’s deliver and translate the data between Tableau and the databases, such as those they will be working with through Cloudera. Jewett noted that Simba is not the only driver provider, but it is the one they work with most.

“We send out SQL or derivative forms of SQL out to the different databases that we talk to,” said Jewett regarding the specifics of the translation that happens in the Simba drivers between Tableau and the databases.
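To make the idea of “derivative forms of SQL” concrete, here is a minimal sketch in Python of how one tool-neutral request might be rendered differently for two back ends. It is not Tableau’s or Simba’s code: the names `TopNQuery` and `render`, the `sales` table, and the dialect labels are all hypothetical, chosen only for illustration. The one real difference it relies on is that HiveQL caps rows with a trailing `LIMIT`, while T-SQL uses `TOP` after `SELECT`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopNQuery:
    """A tool-neutral request: the first `limit` rows of `columns` from `table`."""
    table: str
    columns: tuple
    limit: int


def render(query: TopNQuery, dialect: str) -> str:
    """Render the request as SQL text for one (hypothetical) target dialect."""
    cols = ", ".join(query.columns)
    if dialect == "hive":
        # HiveQL puts the row cap at the end of the statement.
        return f"SELECT {cols} FROM {query.table} LIMIT {query.limit}"
    if dialect == "tsql":
        # T-SQL puts the row cap right after SELECT.
        return f"SELECT TOP {query.limit} {cols} FROM {query.table}"
    raise ValueError(f"unsupported dialect: {dialect}")


# Illustrative request; the table and column names are made up.
request = TopNQuery(table="sales", columns=("region", "revenue"), limit=100)
hive_sql = render(request, "hive")
tsql_sql = render(request, "tsql")
print(hive_sql)
print(tsql_sql)
```

The client describes what it wants once, and the translation layer decides how to phrase it for each engine. A real driver would handle far more than row limits, such as identifier quoting, function names and data types.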

At the core of what Tableau wants to do is a hard problem: visualizing sizable swaths of data is difficult. As Jewett notes, a person cannot simply look at the several million rows that constitute a large dataset and pick out the trends. Nor is it particularly obvious which functions should be run on those datasets to produce any sort of insight.

Jewett spoke to these challenges in reference to someone trying to generate analysis from petabyte storage systems. “If you dumped two million rows of data, if you run a query through a petabyte system and you’re getting back two million rows of data that were the subset of the information that answered your question, there’s no way that looking at a grid of that data would help you out. As you take that information and you look at it visually, trends, outliers, nuanced patterns that are in that data start jumping out at you pretty quick and it allows you to continue your cycle of iteration.”

Tableau has a decent pedigree in the big data world to back up the claims they make. In a recent interview, Philip Wickline, CTO of Hadapt (a big data startup that is trying to let users run analysis queries without having to write them in MapReduce), mentioned that Hadapt’s BI clients are overwhelmingly choosing Tableau. That endorsement is not insignificant for Tableau when their competitors QlikView and Splunk also feature high-profile connections.

Of course, for Jewett, the challenges come back to performance. “Performance on what’s happening on the back end to respond back to you is critical. It’s kind of a buzzkill if you’re going through an iterative process and it takes twelve minutes between every question you ask. It’s hard to get into that flow of exploring your data.”

Like a good statistical analysis, a good visual analytics tool directs the eye and says, “look over there,” pointing in this case at actionable enterprise trends. Tableau appears to be doing the same thing with both their tools and their announced partnerships, seemingly keeping a finger on the pulse of the big data visualization realm.

 
