November 25, 2013

Driving from Insight to Application

Isaac Lopez

Having valuable data is one thing. Leveraging it into something useful is something else. Even when a company has refined raw data into valuable analytics, piping the insights back into a customer-facing application where they can be leveraged for business good can be daunting. A company called WibiData says it’s built a platform to cut through some of this hassle.

WibiData is a Cloudera spinoff company that aims to grease the rails between insights gleaned from data in Hadoop and the applications needing to leverage them. Omer Trajman, VP of Operations with WibiData, told Datanami that turning data into something that a company’s applications can leverage is a challenge that every organization in the data landscape faces, but rarely manages to handle particularly gracefully.

“What we see repeatedly is that while people are able to collect lots of data, and now especially with all the SQL on Hadoop, they can analyze lots of data,” he explained. However, “in order to get that model back into the application itself could take weeks or months.” The process often requires a data scientist to explain to an engineer how to rewrite in Java code that may have been written in MATLAB or Python. “After that, it needs to get plugged into the deployment cycle,” said Trajman, “and then structure the site around accessing that predictive model.”

It’s a complicated process that Trajman says can cause a lot of pain and significant costs as companies try to get front-end use from their data streams. And even if the companies manage to complete the loop, in many cases the solution they apply is batch driven, and thus limited in its ultimate usefulness because the models don’t react to changes in the customer dynamics, muting the impact of the analysis.

WibiData has created a platform, WibiEnterprise, that it says helps to bridge the gaps between the application developer and the data scientist by bringing schema back to NoSQL – without breaking its flexibility. “The challenges people have with relational schemas is they are very tightly coupled between one application and one data store,” says Trajman. “If you have a mobile app and a web site, they’re probably not sharing data in the same database. You have some complex enterprise service bus that’s moving data between them. You have an experience on the mobile side, and it’s not immediately available on the web side.”

Using schema management in WibiEnterprise, Trajman says, both applications can talk to the same data, even if they’re running against different versions of the schema. “So it’s like you deploy a new mobile app and you need version 2 of the schema to capture new fields. WibiEnterprise handles that transparently,” he told Datanami. “You can evolve on the fly without downtime. Your mobile app will have new fields. Your website is still going to see the old schema, but they’re both talking to the same data, and they can both share data.”
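To make the idea concrete, here is a minimal Python sketch of two applications reading the same stored records through different schema versions. It is an illustration only: the schema names, the field names, and the `read_with_schema` helper are all hypothetical and are not WibiEnterprise or Kiji APIs, and the article does not describe how the product implements this internally.

```python
# Hypothetical illustration only: none of these names come from WibiEnterprise or Kiji.

# Each schema version maps a field name to the default a reader falls back on.
SCHEMA_V1 = {"user_id": None, "last_page": ""}
SCHEMA_V2 = {"user_id": None, "last_page": "", "device_model": "unknown"}


def read_with_schema(record, reader_schema):
    """Project a stored record onto the schema version the reader understands.

    Fields the reader does not know about are ignored; fields missing from
    the stored record are filled with the reader schema's default value.
    """
    return {
        field: record.get(field, default)
        for field, default in reader_schema.items()
    }


# One shared store, written to by both applications.
store = [
    # Written by the website, which still uses version 1.
    {"user_id": 1, "last_page": "/home"},
    # Written by the new mobile app, which uses version 2.
    {"user_id": 2, "last_page": "/cart", "device_model": "Nexus 5"},
]

# The website sees every record in the old shape.
web_view = [read_with_schema(r, SCHEMA_V1) for r in store]

# The mobile app sees every record in the new shape, with defaults filled in.
mobile_view = [read_with_schema(r, SCHEMA_V2) for r in store]
```

In this sketch the website never sees `device_model`, while the mobile app gets a default value for records written before the field existed, so neither application has to be redeployed when the other changes its schema.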

The result is a system in which an application developer can leverage data coming in from both mobile and web apps, tying them together using a standard REST interface. Meanwhile, the data scientist can use the very same system to build and experiment with models, and see the results of those experiments in real time. “It’s the way the titans of the web are operating,” says Trajman, “but now it’s available to everyone.”

While WibiData originally launched its platform in a closed, proprietary format, last year the company turned its efforts towards building an open source community around its capabilities and functions. The result is the Kiji Project, an Apache 2.0 licensed endeavor that enables developers to download and utilize the framework at no cost.

Today the company launched WibiEnterprise 3.0, which effectively packages the progress made with Kiji in a supported format with commercial license terms. In the future, Trajman says, the company expects to introduce new enterprise capabilities, such as dashboards and monitoring, that won’t be in the open source version.

Ultimately, both versions aim to do the same thing, and that’s shrinking the complexity in turning analysis into application. Who can argue with that?

Related Items:

Yahoo Unveils SAMOA to Mine Multiple Data Streams 

Amazon Tames Big Fast Data with Kinesis Pipe

LinkedIn Open Sources Samza Stream Processor
