Alpine Demos Big Data Analytics from an iPad
Alpine Data Labs is one of many vendors at the Strata + Hadoop World conference offering a library of statistical algorithms designed to whittle big data sets down into useful information. Throw a ball across the expo floor and it would be tough not to hit such a vendor. But Alpine is one of the few big data vendors demonstrating its analytics running on an iPad.
Alpine is at Strata to preach the message of simplification. The Visio-like user interface in the latest release of its product, Alpine 3.0, is designed to let ordinary data analysts perform big data tasks such as churn prediction, product recommendation, and fraud detection.
This spares organizations from having to find, recruit, and hire Ph.D.-level data scientists with proficient Java programming skills, who are few and far between in this big data age. What’s more, Alpine’s Web-based interface can be accessed from any device with a Web browser, including smartphones and tablets (although using it on a smartphone may be a stretch). Of course, the meat of the company’s offering, the part that contains the library of “operators” that appear as icons on the interface, doesn’t live on the iPad. Instead, it sits on the same cluster that houses the customer’s big data set.
Alpine chief marketing officer Bruno Aziza says those two items, the in-database approach and the ease of use on the UI front, are what define the company and its product.
“We started out saying, surely there has to be a way to go after business analysts with script-less solution, and surely there’s got to be a way to run everything on Hadoop,” he tells Datanami. “So if you have Hadoop, we’re running the logic directly on your cluster. You don’t have to move data anymore, and you’re doing analysis at scale, on the entire data set.”
Alpine’s software can sit on the Hadoop platforms from Cloudera, MapR Technologies, and Pivotal. It also supports traditional databases and data warehousing systems, including IBM, Oracle, PostgreSQL, and Greenplum. That list will likely grow, depending on customer requirements.
Alpine’s Visio-like interface is apparent in this screenshot from the software running on a laptop.
While intuitive UIs are nice, the ability to crunch data is really what customers are after. To that end, the heart of the Alpine solution is a library of predictive analytics, machine learning, and statistical algorithms that users can call as operators in the Visio-like GUI. This allows users to “drag and drop transformations into Hadoop,” Aziza says. “So you’re able to allow a normal person to write instructions for Hadoop without knowing MapReduce, which is huge,” he says.
In terms of the algorithms themselves, Alpine has about 80 percent of what most customers will use most of the time, Aziza says. The list of algorithms and data model operators includes K-means, logistic and linear regression, naïve Bayes, decision tree, CART, neural network, SVM, time series, ROC, lift, and goodness-of-fit components.
Alpine gives users other built-in functions to aggregate and explore their data sets, including value analysis, frequency analysis, histograms, aggregates, row filter, and derived columns. Data transformation functions include pivot tables, numeric-to-text, aggregate, row filter, normalization, and quartile columns. The product also includes data correlation and information value analytical functions.
The company doesn’t pretend that its library of statistical algorithms matches that of, say, SAS Institute or IBM’s SPSS. “We have a fair amount of what they have,” Aziza says. “To be honest, they have more. It’s a maturity thing. SAS has been in the market since 1976. But we’re getting there.”
There is a green tinge to Alpine, as the company’s two co-founders, Anderson Wong and Yi-Ling Chen, both hail from Greenplum, the big data analytic appliance maker that’s now owned by EMC. Also coming directly from Greenplum are Joe Otto, Alpine’s CEO, and Steve Hillion, who led some of Greenplum’s early work on simplified data analytics workflow and joined Alpine as chief product officer in 2012.
The San Mateo, California, company was founded in 2010 and has raised $7 million so far. Early investors include Scott Yara, the president and co-founder of Greenplum, in addition to Mission Ventures, Sierra Ventures, Stanford University, and Sumitomo Corporation Equity Asia Limited.
The company is still early in its development phase, and has just a handful of customers. But some of them are big, including Ford, Disney, and Barclays. Another early adopter, the French advertising firm Havas Media, is using Alpine to help build better advertising campaigns for its clients.
Sylvain Le Borgne, the executive vice president of data platforms at Havas Media, says Alpine has helped give it a competitive advantage. “With Alpine, our teams of business analysts can run complex simulations very rapidly and share the results of their work on the Web, across the company with sales, marketing and other business functions,” he says.