October 17, 2012

Lavastorm Enlivens Platform with R


The era of big data has meant a resurgence of interest in the R statistical approach to analytics. Accordingly, a number of startups have aimed at increasing its usability and functionality, while still other companies, often established players in the analytics market, have sought to integrate statistical capabilities into their existing platforms.

Among the latter group is Lavastorm, which just announced an interface between its own analytics platform and the R statistical language. The integration lets users boost their processes with the power of the sturdy open source stats language without the need to implement an internal R server.

As noted, Lavastorm is not the first company to explore the options around commercializing a language that was rooted in academia for much of its development cycle. The last couple of years have shown that companies like Revolution Analytics are able to sustain a growing business on the promise of extending R’s capabilities beyond their traditional purview.

Whereas Revolution, for instance, has implemented performance and scalability improvements to core R through technology enhancements to the platform itself, Lavastorm is doing something quite different by offering a complementary solution that they say provides better tools for working with the data before R analyses are applied.

The company’s CTO, Rich Boccuzzi, described Lavastorm's foray into R as a means to tap into new stats possibilities, claiming that users of its existing platform can now execute R scripts using data housed in the Lavastorm Analytics Platform and receive results back into the platform for further integration and analysis. Boccuzzi is no stranger to the company’s analytics platform and engine, having been involved with the development process behind it since joining Lavastorm in 1999.

Boccuzzi believes that all the buzz around big data is drawing more attention not only to analytics and business intelligence options, but also to more established approaches to analysis like trusty old R. As he told us, “One interesting facet of the big data explosion has been the expansion in the set of people who must become data analysts. This requirement isn’t usually satisfied cheaply or easily with software, and it often requires a technical skill set which traditional analysts may not have or easily obtain.”

He says the real business value of R for this group of users is that it “provides a great entry point for self-sufficient analytical work, since it doesn’t incur large infrastructure costs and it offers so much functionality off the (community-driven) shelf. When you put this much power in more people’s hands, you see opportunities for application of sophisticated analytics where it would have been cost-prohibitive before.”

Just as R itself has been around for ages, Lavastorm goes way back with big data analytics and business intelligence. With MIT research roots that blossomed into business shoots in 1993, the company found a home in the telecom industry in particular with rather basic database-driven operational tools. As the needs of businesses grew more complex, the company expanded into new markets with the advent of its Analytics Platform and Analytics Engine, which aim to merge, clean and define diverse data types for analysis. The company claims its approach is robust enough for data-intensive projects like fraud detection, optimization and healthcare analytics, with the ability to analyze 3 billion records per day across multiple analytic processes, although, as we might imagine, it is difficult to rely on such general numbers for such diverse data and application types.

As Boccuzzi explained, the problems Lavastorm is trying to solve with its R tie-in revolve around the challenges of assembling data from across enterprise silos and federating it into a comprehensive and trustworthy foundation for R analyses. He says that to make this more seamless, the Lavastorm platform provides a visual front end for designing data acquisition, federation, and analysis applications, and a powerful, scalable back end for processing those applications.

“We see the Desktop flavor of our offering (which is available in a free Public Edition) as a great way for individuals to manage the data they’ll use with R, and also to enhance their R-based analytics by leveraging our set of components to ensure the integrity of the data,” he explained. Boccuzzi noted that the visual, component-based nature of the company’s software allows R users to create drag-and-drop components that contain complex data manipulations and R analyses but that can be packaged for use by analysts who may not themselves be proficient in R. For example, a user may combine a sequence of data filters and joins that prepare the data with a linear regression R script into a single node, which another user can then execute without knowing exactly how those operations are performed. This allows organizations to make R-based statistical analyses available and useful to a wider audience. “We believe the combination of our software with R represents a major enhancement to the way R is being used, and can broaden the set of users who can leverage R’s power,” he added.
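To make that example concrete, here is a minimal sketch of the kind of R script such a node might wrap: a little data preparation followed by an ordinary linear regression. The file name, data frame and column names are invented for illustration and are not part of Lavastorm’s actual components or APIs; in the scenario Boccuzzi describes, the prepared data would arrive from the platform’s upstream filter and join steps rather than from read.csv().

# Hypothetical illustration only; the data set and column names are invented.
# Load and lightly prepare the records (in the Lavastorm scenario, upstream
# components would already have filtered and joined the data).
calls <- read.csv("call_records.csv")
calls <- na.omit(calls)
calls <- subset(calls, status == "completed")

# Fit a simple linear regression: billed amount as a function of call
# duration and data volume.
model <- lm(billed_amount ~ duration_min + data_mb, data = calls)

# Summarize the fit and score the records; in the integration described
# above, these results would be handed back to the platform for further
# integration and analysis.
print(summary(model))
calls$predicted_amount <- predict(model, newdata = calls)

Wrapped inside a single drag-and-drop node, an analyst could run the regression and see the scored output without ever reading or editing the script itself.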

In terms of pushing a business model beyond its bread-and-butter platform users, the Lavastorm CTO explained that the company is simply trying to provide users of R with a better end-to-end approach, from data access through results presentation. “You can use free R with the free version of Lavastorm software and be quite productive. We see the commercial potential for Lavastorm Analytics’ R integration in the desire to work with larger data sets and to do more with the data before and after they are treated with R. So while our R integration is now and will remain free, we believe that exposing users to our technology will provide a compelling offering for which they will pay a premium.”

While the combination may not address the ultra high-end needs of data- and compute-intensive tasks as firmly as solutions that leverage Hadoop, in-memory platforms or vast streams of complex data, it could boost R’s use among mid-sized operations for data mining and optimization, as well as for building models for specific tasks.
