August 15, 2013

Data Driving the Exit Into Hadoop

Isaac Lopez

Despite its long-term promise, one of the side comments often heard when discussing Hadoop is that it’s the king of the “proof-of-concept.” Virtually everyone is playing with Hadoop, but often, especially where established enterprises with entrenched relational databases are concerned, Hadoop stays at the sandbox stage.

It’s partly a crisis of confidence, argues Shawn Dolley, a VP with IT analytics company Appfluent. “Six months ago, I was in a session – there were about 100 people in the room, and the speaker asked the audience, ‘Who here has tested, played with, and investigated Hadoop?’” Everyone in the room raised their hand, said Dolley, but when it came to how many had moved to production, there weren’t so many raised hands. “A lot of that is about confidence,” he argues. “Where to go first?”

It’s a complicated challenge that the Hadoop distro vendors have to solve. With data in traditional systems growing at unprecedented rates, committing to a new paradigm can be daunting. “The trick is, which of the data sources, and which of the processes are most advantageous in which environment,” says Tim Stevens, VP of Business and Corporate Development with Cloudera. To help customers take that first step, Cloudera has partnered with Appfluent – a company that is finding new purpose building custom roadmaps into Hadoop.

Founded in 1998, Appfluent has long offered analytic tools for the data warehouse aimed at diagnosing performance issues, increasing efficiency, and generally getting the most out of existing resources. But with the rise of Hadoop, the company believes it has found a new niche in helping enterprises make data-driven decisions as they transition their precious bits into Hadoop.

Their tool is essentially a high-powered x-ray into the data warehouse. Using Appfluent’s diagnostic tool, enterprises can spelunk to the deepest depths of their database and come back with an information haul on virtually every aspect of the system, its data, and its usage. Appfluent’s cataloguer dives in and logs every table, view, user, and SQL statement, tracking usage both historically and going forward.
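The core idea behind that kind of cataloguing – mining the query log to see which objects are actually touched – can be sketched in a few lines. The log format, table names, and regex below are illustrative assumptions, not Appfluent’s actual implementation:

```python
import re
from collections import Counter

# Hypothetical query log: one logged SQL statement per entry.
QUERY_LOG = [
    "SELECT user_id, total FROM orders JOIN customers ON orders.user_id = customers.id",
    "SELECT sku FROM inventory",
    "SELECT user_id FROM orders WHERE total > 100",
]

# Naive table extraction: grab the identifier after FROM or JOIN.
TABLE_RE = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)

def catalog_table_usage(statements):
    """Tally how often each table is referenced across the logged SQL."""
    counts = Counter()
    for sql in statements:
        counts.update(name.lower() for name in TABLE_RE.findall(sql))
    return counts

usage = catalog_table_usage(QUERY_LOG)
# orders is referenced twice; customers and inventory once each
```

A production tool would parse SQL properly (subqueries, aliases, views) and record users and timestamps as well, but the output is the same in spirit: a usage census of the warehouse.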

With this information in hand, these companies have a virtual custom roadmap based on their unique situation into the world of Hadoop, says Stevens. “Enterprises for a long time have been faced with the decision of determining what data should go into Hadoop,” he commented. “With Appfluent, we can work with customers to help them understand the totality of their data warehouse and data mart environments so that they know what data and processes they are able to migrate into [a Hadoop environment].”

When Expedia decided to cap its data warehouse at 200 TB, explained Dolley, the company used Appfluent to help determine which data needed to stay put and which data to move into the Hadoop overflow. According to a release from last fall, Expedia now manages over four petabytes of data using Cloudera Enterprise.

In many cases, says Dolley, the company will find thousands of columns that are never or only rarely queried, taking up blocks of the database and ultimately hindering performance. In the past, these dormant columns might simply have been jettisoned to free up space, but under the big data paradigm that is treachery – there is no telling what kind of corollary gold might be found in those bits. “Those columns are the ones that need to be in a warm archive like Hadoop, rather than in an expensive data warehouse.”
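Once a usage catalogue exists, flagging those dormant columns is a straightforward filter. The catalogue contents, column names, and one-year threshold below are invented for illustration – the point is the triage logic, not any vendor’s actual rules:

```python
from datetime import date, timedelta

# Hypothetical workload-catalogue output: column -> date last referenced
# in logged SQL (None means the column never appears at all).
last_access = {
    "orders.user_id":    date(2013, 8, 1),
    "orders.total":      date(2013, 7, 28),
    "orders.legacy_fx":  None,
    "orders.promo_code": date(2011, 2, 14),
}

def archive_candidates(catalog, today, dormant_days=365):
    """Columns untouched for dormant_days (or never queried) are
    candidates for a warm archive rather than the warehouse."""
    cutoff = today - timedelta(days=dormant_days)
    return sorted(
        col for col, last in catalog.items()
        if last is None or last < cutoff
    )

cold = archive_candidates(last_access, today=date(2013, 8, 15))
# → ['orders.legacy_fx', 'orders.promo_code']
```

The dormant data is moved, not deleted, so it remains queryable in Hadoop if that corollary gold ever needs to be mined.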

Hadoop as a database paradigm of the future seems virtually inevitable at this point; the question is the speed of adoption, particularly for entrenched businesses. It’s interesting to see companies like Appfluent, with roots in the old world, transforming into ushers to the new one.

Related items:

The Three T’s of Hadoop: An Enterprise Big Data Pattern 

Bare Metal or the Cloud, That is the Question… 

Manufacturing Real-Time Analytics on the Shop Floor 
