June 18, 2013

LexisNexis Touts HPCC’s Benefits Over Hadoop

Alex Woodie

Apache’s Hadoop framework stormed onto the scene several years ago, and quickly gained a large chunk of the market for big data processing. But if the data and services firm LexisNexis has its say with its open source HPCC Systems offering, Hadoop won’t be the only big data platform in town.

LexisNexis, which owns one of the largest databases of legal and public-records related information in the world, launched its High Performance Computing Cluster (HPCC) offering more than 12 years ago to allow customers to do the same types of big data management and processing that LexisNexis offers as a service.

Then Hadoop stormed the big data party, and the HPCC strategy had to change, according to Arjuna Chala, director of technology for HPCC Systems at LexisNexis. The big move: open sourcing the HPCC technology before Hadoop built an insurmountable lead.

“At that point [in 2011], we realized that we were at least three years ahead of the Hadoop technology,” Chala says in a video interview with Storage Switzerland. “But if we did not open source it at that point, we realized that, in five years, we might be looking at migrating our services to something like Hadoop. So the whole idea of open source our HPCC platform was to stay relevant by offering a much more superior product than what Hadoop has.”

Chala says the HPCC offering is mature at all levels–from the ETL processes on the front end, to the analytical processing and algorithms in the middle, and ultimately to the information delivery to customers. “The key thing to understand here is that all the analytics, all the analytical pieces that are needed for your typical data processing job, is built into the HPCC platform…In addition, we have all the data delivery capabilities built in. It’s production ready and production tested.”

Hadoop, on the other hand, he says, requires more work to mold into a working big data application. “In the case of Hadoop, you would have to hire MapReduce developers, you’d have to hire statisticians, and bring them together to make it work,” he says. “In our case, because we’ve worked on it for so long, we’ve built all the underpinning functionality and algorithms into the platform itself.”

LexisNexis's partnership with Dell has made the hardware as turnkey as the software, Chala says. The company has worked with the Texas computer maker for years to bundle x86 hardware with the HPCC software. In its early days, the partnership yielded optimizations for hardware and networking issues LexisNexis faced. More recently, Dell has provided HPCC as a cloud service, Chala says.
