July 23, 2012

The Velocity Key to Big Data Veracity

Ian Armas Foster

GridGain, a company headed by CEO Nikita Ivanov, claims it is disrupting the big data management market with its in-memory systems.

In a recent briefing with analyst Dr. Robin Bloor, Ivanov noted the relevance of the three V’s of big data, but dismissed volume and variety, essentially claiming that it is a given for big data to be voluminous and that there are always new and different types of data. Ivanov interested himself only in velocity, which refers to the pace at which he could get results to customers.

GridGain, according to Ivanov, is “Java-based open source middleware for real-time big data processing.” While everyone’s definition of real-time differs (Bloor defines it as under a tenth of a second, or quicker than a computer mouse’s lag time), a claim of real-time big data processing would indeed be disruptive.

At the core of Ivanov’s point about velocity is the issue of in-memory processing. More specifically, it relies on this simple fact: RAM is up to 100,000 times faster than disks, according to Bloor. Further, while RAM remains significantly more expensive than disks, its prices have been dropping 30% per year (Bloor projected it would take eighteen years for the cost to match that of disks).
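The price arithmetic can be checked with a short calculation. The sketch below is only an illustration under stated assumptions: it takes the 30% annual decline quoted above, and it assumes disk prices stay flat, which the briefing does not say. The function names are hypothetical, and Bloor's actual model is not described in the article.

```python
import math

ANNUAL_DECLINE = 0.30  # RAM price drop per year, as quoted in the article


def price_factor(years, decline=ANNUAL_DECLINE):
    """Fraction of today's RAM price remaining after `years` of compounding decline."""
    return (1.0 - decline) ** years


def years_to_close_gap(price_ratio, decline=ANNUAL_DECLINE):
    """Years until RAM falls by `price_ratio`, ASSUMING disk prices stay flat.

    `price_ratio` is the hypothetical RAM-to-disk cost ratio today.
    """
    return math.log(price_ratio) / -math.log(1.0 - decline)


if __name__ == "__main__":
    # Price gap that an eighteen-year projection would imply under these assumptions.
    implied_ratio = 1.0 / price_factor(18)
    print(f"Implied RAM-to-disk price ratio: about {implied_ratio:.0f}x")
    print(f"Years to close that gap: {years_to_close_gap(implied_ratio):.1f}")
```

Under these assumptions, an eighteen-year horizon corresponds to RAM costing several hundred times more than disk today. If disk prices also keep falling, the gap would take longer to close.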

“For a little less than $40,000, you can store in-memory every tweet globally for seven days,” says Ivanov.

While his praise for Hadoop is effusive, Ivanov refers to it as an effective data warehouse, a place where you can keep lots of data for a long period of time. GridGain, on the other hand, is trying to incorporate the processing into the data warehouse. “Fast data = data warehouse + in-memory processing,” he displays on a slide.

A one-tenth-second delay costs Citi up to a million dollars. A half-second delay costs Google 20% of its traffic, which Ivanov claims would cost Google billions of dollars. Clearly, obtaining faster processing speeds is a lucrative venture. Bloor, a third-party analyst, evaluated Ivanov’s claims.

“This ends up being a much shallower hierarchy, which changes everything,” Bloor says. “But we’re not going to get there in one jump.” He did, however, agree with Ivanov that the infrastructure was not complicated, showing a slide in which operational apps and BI apps talk directly to each other without going through third-party disks, archives, or people.

According to Ivanov, GridGain is currently using its system for the following test cases: real-time DNA sequencing and matching, real-time geographical route and traffic information, live real-time insights and operational BI (he mentions Apple here), online gaming, analysis of trading positions and risk, and handling large volumes of transactions.

“One client we have,” says Ivanov, “in London and it’s a very large financial organization. Essentially they’re doing an option pricing engine. With GridGain they’ve been able to move from a service bureau business model where they get the portfolios and the next morning they price them out to right now where they’re doing it real time.”

Datanami