October 17, 2013

GridGain Puts Thrusters On MapReduce With Hadoop Accelerator

Isaac Lopez

In-memory computing has been all the rage in 2013 as vendors far and wide line up to offer businesses more speed for their processing dollar. GridGain, a company built around the in-memory concept, says it is bringing that approach to Hadoop with an accelerator that it says puts thrusters on the framework.

The offering, called the “In-Memory Accelerator For Hadoop,” is essentially an in-memory replacement for the standard HDFS, as well as in-memory MapReduce. “Just replacing HDFS isn’t going to cut it because a lot of time is spent in the overall MapReduce infrastructure for Hadoop,” GridGain Founder and CEO Nikita Ivanov told Datanami. “Our in-memory Hadoop accelerator is basically composed of two parts – one is a plug and play replacement (or extension) of HDFS and high performance in-memory file system. [The other] is an in-memory implementation of MapReduce.”

Ivanov says that the two pieces together are creating a powerful performance boost for virtually every flavor of Hadoop distribution available. “Just changing HDFS would probably get you 2 or 3 times faster results at best,” says Ivanov. “With the optimized in-memory MapReduce implementation, that’s when we can show you up to 100 times performance increase on the MapReduce jobs.”
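Ivanov's point that swapping out HDFS alone yields only a 2x to 3x gain is consistent with Amdahl's law: the overall speedup is capped by the share of runtime that is left untouched. The sketch below is only an illustration of that arithmetic. The function name and the 60% and 99% runtime shares are hypothetical assumptions chosen for the example, not GridGain measurements.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when `accelerated_fraction` of the original runtime
    becomes `factor` times faster and the rest is unchanged (Amdahl's law)."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)


# ASSUMPTION (hypothetical): 60% of a job's time is spent in HDFS I/O.
# Even if in-memory storage made that part 1000x faster, the remaining
# 40% (scheduling, shuffle, task startup) would still dominate.
storage_only = amdahl_speedup(accelerated_fraction=0.60, factor=1000.0)

# ASSUMPTION (hypothetical): storage plus the MapReduce layer together
# cover 99% of the runtime, and both become 1000x faster.
storage_and_mapreduce = amdahl_speedup(accelerated_fraction=0.99, factor=1000.0)

print(f"storage only:          {storage_only:.1f}x")
print(f"storage and MapReduce: {storage_and_mapreduce:.1f}x")
```

Under these assumed shares, accelerating storage alone cannot exceed 1 / 0.4 = 2.5x no matter how fast the file system becomes, which is why very large gains require speeding up nearly the whole job path.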

Ivanov claims that they’ve already got implementations of the accelerator in the market. “We have a public company we’re working with – you would know the name,” he says. “They have a big Hadoop-based project that is a product for them that they sell. We’re showing anywhere between 7 and 15x performance gains…it’s a business changer for them.”

The accelerator, which just came out of beta this past summer, is one of several in-memory offerings that GridGain has as part of what they call the “first end-to-end in-memory stack,” an integrated set which also includes their “In-Memory Database,” “In-Memory Streaming,” and “In-Memory HPC” offerings.

“There are different payloads, and no matter if they’re all in-memory, they all require different approaches,” said Ivanov. “I think that’s one of the stabbing points why, for example, SAP HANA isn’t selling as much or traditional databases aren’t selling as much because system of records as a data store is just one use case.”

“We have a database, and everybody else has a database,” he continues. “But what if you have streaming? What if you have normal Hadoop that you want to accelerate? What if you have typical HPC performance problems? Databases don’t do anything for you – they actually just make things more complex for no reason. I think that’s what we bring to the forefront – just because we have a product we’re not trying to solve all the problems with that product. We developed the individual product that addresses individual particular problems of particular payloads that enterprises have today.”

(Editor’s note: SAP claims that HANA is its fastest-growing product of all time, though there have been questions about how these claims were calculated. SAP says it expects HANA software revenue of €650 million to €700 million in 2013.)

With an entire solution stack aimed at in-memory computing, GridGain is playing in an ever-crowding space with plenty of big fish swimming around. For its part, GridGain was able to secure $10 million in Series B funding this past July in a round led by Almaz Capital, bringing their war chest to $12.5 million raised over the last two years.

Pricing on the GridGain offerings is on a per-unit basis and can be expected to range from $25k for a small installation to seven figures for businesses running thousands of nodes.


Datanami