Platfora Seeds Big Data Future in Openness
Platfora launched its end-to-end analytics application for Hadoop when the only other option was to build your own. To that end, Big Data Discovery was built to include everything you need. But with today’s update to the tool, issued on the second day of the Strata + Hadoop World conference, Platfora is opening up a bit more in an effort to better integrate with popular tools in the ecosystem, namely Tableau and Spark SQL.
While Platfora’s software comes with its own data visualizations, the company recognizes that some customers want to consume the analytic results that Big Data Discovery generates from the comfort of another business intelligence tool. In many cases, that tool is the popular data visualization software from Tableau (NYSE: DATA).
“There’s a lot of people who love Tableau,” says Peter Schlampp, vice president of products at the San Mateo, California, company. “It’s the new Excel.”
With version 5.2, the company is making it easier than ever for Platfora users to export data to Tableau in its Tableau Data Extract (TDE) file format. The vendor does this by enabling analytic results, in the form of the big data “lens” that is stored in a highly compressed in-memory format, to be exported in a way that makes it easy for Tableau users to open and manipulate the data.
Platfora did that specifically for Tableau, Schlampp says. “It’s great if you’re dealing with smaller data volumes. That’s what Tableau is specifically suited for, datasets from megabytes to gigabytes in size,” he says. “There’s a lot of results of big data analysis that need to be consumed that are of that size.”
Customers who are dealing with bigger data sets, those measuring in the terabytes to petabytes, will be interested in another new feature in Big Data Discovery 5.2, one that involves integration with a second popular big data technology: Spark SQL.
The new Lens Accelerated SQL feature enables users to interactively query Platfora lenses through standard ODBC connections and common BI tools, including those from Tableau, Qlik (NASDAQ: QLIK), MicroStrategy (NASDAQ: MSTR) and TIBCO Spotfire.
“We’ve essentially built a way for Spark to read our lenses like a resilient distributed dataset (RDD),” Schlampp says. “The nice thing is you get the benefit of Platfora–the speed of building the lens, the incremental updates, security, management, lineage–all those things are still there. But now you can extend the benefits of Big Data Discovery to any BI tool.”
It also extends the software to a new class of user beyond the data scientists and citizen data scientists who work within Platfora to prep, blend, and analyze the data in preparation for deeper analysis. These information analysts won’t be asked to learn how to use Platfora’s tool, but they can still work with the insight it generates through the BI tool of their choice.
This isn’t the first time Platfora has leveraged Spark, of course, and it probably won’t be the last. The company also uses Spark as an execution engine for data preparation routines (alongside a MapReduce-based engine that still works very well for batch-oriented jobs on very large datasets), as well as to power data visualizations. “This is yet another way we’re using Spark,” Schlampp says. “It’s really well-written and is a very clearly defined extension point for us that’s easy to maintain.”
Big Data Discovery 5.2 also brings enhancements to how it operates with Hadoop. While Platfora is ostensibly a Hadoop-based application and has run the first stage of its analytical pipeline on Hadoop, the company has always kept a portion of the product, the part where the in-memory lenses are interactively accessed and analyzed, on a separate cluster. With this release, the product can now run entirely on YARN (although customers still have the option to run the final analytic component on separate nodes for performance reasons if they so choose). Finally, this release brings enhancements to the Vizboards that present data to the end user.
At the end of the day, these actions show that Platfora, which cleared a $30-million financing round late last year and brought in a new CEO, Jason Zintak, is looking to maintain openness in a burgeoning big data ecosystem that’s focused on open source. “This is the biggest release of the year for openness for us,” Schlampp says.