December 16, 2013

WANdisco Plots Growth Solving Hadoop’s NameNode

As 2013 comes to a close, the big data wonder kid Hadoop is as strong as ever. However, the strength of Hadoop doesn’t mean everyone selling it will survive. The year 2014 looks to be a very rocky road for a lot of the pushers in the overcrowded Hadoop marketplace. One lesser-known Hadoop vendor, WANdisco, isn’t too concerned.

Since Cloudera came on the scene with the first commercial release of its supported Hadoop distribution, the market has exploded with Hadoop offerings as vendors ranging from start-ups to industry giants have moved to capitalize on the framework. As some of the larger dogs in the pack race towards IPOs, others are left to find a niche before the air runs out of the room.

WANdisco, a solution provider in the Hadoop arena, seems to have found its safe spot by attacking a specific critical problem area within the Hadoop framework and endeavoring to master it. While the other vendors in the space rushed to make their distributions feature-rich so as to entice enterprises with impressive outlays and capabilities, WANdisco focused its efforts largely on a single problem: the troublesome NameNode, the Achilles’ heel of the elephant. That problem has largely prevented the framework from becoming the large-scale transactional system that vendors hope it can be.

A centerpiece of the Hadoop HDFS file system, the NameNode serves as the directory for the Hadoop mall, telling the system where the file data is kept across the cluster. If the NameNode goes down, the system is effectively unusable – a critical flaw for systems that require high availability. WANdisco says it’s solved this problem, creating what it calls a “Non-Stop Hadoop.” This fall, it’s added two of the largest names in the Hadoop sphere, Hortonworks and Cloudera, as partners, effectively validating its solution and signalling that WANdisco will be around for some time to come.

“We don’t actually want our own Hadoop distribution,” David Richards, WANdisco CEO, told Datanami in an interview last week. “We think we can leverage vendors like Hortonworks and Cloudera much better.” By attaching itself to two of the biggest names in the Hadoop sphere, WANdisco has gone from an obscure Hadoop vendor to a key enabler for the future of the commercial framework.

“The bet that we placed was that Hadoop would go from batch processing, Twitter, Facebook, Instagram, Pinterest, LinkedIn (and so on), cheap storage, to a transactional system where we would see high volumes of storage and transaction processing come together as a single application,” Richards told us. “In order to see that come to fruition, then Hadoop has to move from a batch processing system to a transaction processing system. I think that’s precisely what we’re seeing in the marketplace begin to happen.”

Doug Cutting, the father of the Hadoop technology, clearly agrees. “The prediction we can make here is it’s inevitable that we’ll see just about every kind of workload move to this platform, even online transaction processing,” he told an audience at this fall’s Strata + Hadoop World conference, referencing some of the work being done in the space, including a paper Google published showing that OLTP can run on a Hadoop-style system.

Richards says it’s just a matter of time. “We’re seeing everything from banks and trading systems look at this,” he told us. “We’re seeing fraud analysis systems, and other applications where large amounts of storage is part of the transaction. If that’s the case, continuous availability isn’t a ‘might need’ – it’s an absolute ‘must have.’ Because that’s what’s in place with transaction processing systems today, and our value in this marketplace is that we are the only enabler of continuous availability of Apache Hadoop period.”

Indeed, two key problems have held Hadoop back as an enterprise-ready system: the NameNode single point of failure and a fundamental lack of security that plagued the framework until just recently. While both problems have been feverishly worked on, WANdisco, and seemingly its two high-profile partners, believe that they’ve put the screws to the first one, opening the door wider for greater enterprise adoption.

The year 2014, says Richards, will be a banner year for that march. “I don’t even think that we’ve really gotten going in terms of enterprises deploying Hadoop as a sort of data centric operating system, if you will,” he told us, noting that Hadoop 2.0 and its centerpiece YARN will be a big part of what gets that going. “We are certainly seeing Hadoop move from a batch processing system to a transactional system. YARN is really the enabler for third party vendors to plug their technologies into Apache Hadoop.”

“I expect to see pretty large scale deployment of Hadoop in production environments that replace traditional technologies,” predicts Richards for 2014. “I’m not just talking about the data warehouse – I’m talking about transactional systems. This could be a generation shift in the same way that client server disrupted mainframe. I would expect to see Hadoop disrupt the client server marketplace. This is Halley’s Comet in IT terms – you only see this once in a generation.”

So while the air in the Hadoop room starts to run out in a crowded marketplace, Hadoop appears to have a very bright future for the vendors that are able to muscle out the others. WANdisco appears to have found a lucrative niche that positions it well for the long term.

“I don’t think survival is our objective,” Richards told us. “Our objective is hyper growth.”
