June 13, 2012

ParAccel Lifts Hood on CARFAX Overhaul

Nicole Hemsoth

ParAccel is one of only a handful of analytical platform and database vendors that haven’t been snatched up in the big data analytics free-for-all that prompted big purchases by HP, IBM, Oracle and others.

Today the company announced a new Hadoop connector, along with a fresh crop of user stories that include a web services company, Evernote, and a healthcare information company, Alliance Health Networks, which chose ParAccel in place of Hadoop to handle its big, diverse data, which connects patients who share similar conditions.

In advance of the announcement, we had an extended conversation with the company’s Chief Operations Officer, Paul Zolfaghari. While the connector story is noteworthy for the company’s overall strategy, we wanted to back up and better understand how companies contending with large, complex datasets are hitting a performance wall with traditional database and appliance offerings from the titans, as well as the more general role of high performance hardware for analytical database customers.

To sharpen the lens on ParAccel’s approach, we can look at their recent overhaul at CARFAX. The vehicle history giant houses over 10 billion records, the result of data that’s been snapped up from over 34,000 data sources, including all U.S. and Canadian vehicle agencies, most auto auctions, police and fire departments, collision repair facilities and rental agencies. This granular vehicle history data is then turned around quickly to suit consumers and dealers, a task that requires a heavy-duty database and some high performing hardware if customers are to get their data in a reasonable amount of time.

The problem, however, is that with those roughly 10 billion records, CARFAX hit a performance and scalability wall. The company wasn’t delivering on its SLAs, says Zolfaghari, which meant a serious overhaul of its approach to handling and processing data. CARFAX had to look outside its legacy Oracle databases to scale with increasing data volumes, data complexity, and overall speed of data delivery.

As with any large-scale analytics installation, this wasn’t a simple matter of finding one problem and plugging in a new component solution. Even if it were that simple, there were some stiff requirements that wouldn’t have lent themselves to an appliance or some other database solutions.

First, CARFAX told the several companies scrambling for the contract (among which was Vertica; Zolfaghari was mum on the others) that it needed at least a 10x performance gain to meet its SLAs. Further, it stipulated that whatever it chose had to run on commodity hardware so it wouldn’t hit another scale wall. Also, it demanded ecosystem friendliness: the solution would have to operate with ease inside the existing environment (an Informatica and Cognos blend).

Zolfaghari says that the challenges CARFAX faced when it realized it wasn’t able to scale with the growth of data mirror the challenges that others with strained, legacy operations are trying to address. He says that all the traditional database and management systems that handle mixed workloads are designed for transactional, OLTP-type activity and were never built with analytics in mind.

The vehicle history company vetted several vendors, including Vertica, before settling on ParAccel. Zolfaghari says that on the simplest operations, CARFAX got a greater than 10x performance boost, but for the tough, complex problems (where it really needed something that went beyond a traditional database) it got a 240x performance increase, meaning that some of the serious crunching that used to take days now finishes in hours.
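To put those multipliers in perspective, here is a minimal arithmetic sketch. The article does not give CARFAX's actual baseline runtimes, so the ten-day job below is a purely hypothetical figure chosen for illustration, as is the helper name `runtime_after_speedup`.

```python
def runtime_after_speedup(baseline_hours: float, speedup: float) -> float:
    """Return the new runtime, in hours, after applying a speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_hours / speedup


# Hypothetical baseline: a job that used to run for ten days (not a CARFAX figure).
ten_days_in_hours = 10 * 24

print(runtime_after_speedup(ten_days_in_hours, 240))  # 1.0 hour at the quoted 240x
print(runtime_after_speedup(ten_days_in_hours, 10))   # 24.0 hours at the 10x floor
```

Under that assumption, a 10x gain would still leave the job running for a full day, while a 240x gain brings it down to about an hour.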

Zolfaghari says that what this represents is how having a “large-scale legacy relationship as with the IBMs and Oracles of the world means that as your data volumes and the complexity of the questions you’re asking of it grow, the legacy technologies stop stretching.” He says that this is why his company’s platform, which is “massively parallel, columnar and hardware agnostic,” is finding converts.

With so many similar analytical platforms finding a home in the open arms of industry titans (as was the case with Vertica, for instance) one has to wonder how ParAccel has maintained its independence. Zolfaghari says that their lack of acquisition power means that they are a standalone “best of breed” company. He told us that “if you look across the enterprise software industry, there are always a few standalones that represent the best—the companies offering something that really matters.” In this case, he says what “matters” is their integration and openness. He points to how ParAccel’s ability to work with “all the BI tools, all the ETL tools and all the hardware backends” is what gives them their zing.

His contention is that while the Oracles of the world and appliance vendors are trending with their Exadata-like platforms, they are overlooking something important that he says customers are clamoring for—the ability to choose their own hardware and software; to run on commodity systems that can scale across the board when data and business grow.

The CARFAX example does a little more than highlight the general flaw of legacy systems as data complexity and size grow unchecked, says Zolfaghari. He says that on a more granular level, it shows how some ultra-complex algorithms, such as those used in fraud detection, respond to columnar, MPP approaches.

Really, what CARFAX is doing on some levels is big data fraud detective work. Its algorithms crawl through the 10 billion records to determine whether a vehicle’s history is accurate, a core part of the “certification” service that’s central to the company’s business model. This requires combining two large tables, one of which is something on the order of 70 million records and another that’s in the 220 million record range and steadily growing.
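Combining two large tables of this kind is, in database terms, a join. The sketch below shows one common textbook strategy, a hash join over column-oriented tables, on toy data. It is an illustration only and not a description of how ParAccel or CARFAX implement the operation; the function name, the column names (`vin`, `title_state`, `odometer`) and the sample values are all hypothetical.

```python
from collections import defaultdict


def hash_join(small, large, key):
    """Inner-join two column-oriented tables (dict of column name -> list) on `key`.

    Assumes the only column name the two tables share is `key`.
    """
    # Build phase: map each join-key value to its row positions in the smaller table.
    index = defaultdict(list)
    for pos, value in enumerate(small[key]):
        index[value].append(pos)

    # Prepare one output column per input column, with the key column kept once.
    large_cols = [name for name in large if name != key]
    joined = {name: [] for name in list(small) + large_cols}

    # Probe phase: stream the larger table once and look up matches in the index.
    for lpos, value in enumerate(large[key]):
        for spos in index.get(value, ()):
            for name in small:
                joined[name].append(small[name][spos])
            for name in large_cols:
                joined[name].append(large[name][lpos])
    return joined


# Toy stand-ins for the two tables (hypothetical columns and values).
vehicles = {"vin": ["A1", "B2", "C3"], "title_state": ["VA", "TX", "CA"]}
events = {"vin": ["B2", "A1", "B2", "Z9"], "odometer": [42000, 15000, 39000, 100]}

joined = hash_join(vehicles, events, "vin")
print(joined["vin"])       # ['B2', 'A1', 'B2']
print(joined["odometer"])  # [42000, 15000, 39000]
```

Building the index on the smaller table keeps the in-memory structure small while the larger table is read only once. In a massively parallel system, a join like this would typically be partitioned by key across many nodes, but that distribution step is outside the scope of this sketch.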

He says that when Dell or others come out with a new higher-core, high-memory server, users want the flexibility to tap those benefits right away. This isn’t possible with a proprietary hardware/software package.

The venture-backed company, which was founded in 2005, seems to have a clear story for large-scale analytics customers, but only time will tell how that message plays out against the din from the titans and the appeal of open source frameworks.

