June 10, 2013

Using Hadoop to Augment Data Warehouses with Big Data Capabilities

Alex Woodie

If you run a data warehouse at your organization, you may be wondering how the latest big data technologies, such as Hadoop, can benefit your information analysis. According to IBM product manager Vijay Ramaiah, there are several ways that Hadoop and related tools can augment an existing data warehouse and deliver new analytical capabilities along the way.

Organizations that have already invested lots of time and money into building a data warehouse may be good candidates for augmenting their warehouse with a Hadoop-based system if they face one of several circumstances, Ramaiah, who is the product manager for IBM’s big data portfolio, says in a recent video.

When an organization is “drowning” in big data or throwing away data because it lacks the capability to store and process it, that may signify a good time to front-end an existing data warehouse with a Hadoop repository, Ramaiah says. Similarly, if an organization is using the warehouse to store all data, including cold or rarely accessed data, it may be better off shunting that data over to Hadoop. Organizations that want to analyze non-operational data; that want to explore large and complex sets of data; or that are looking to delay a data warehouse upgrade are also good candidates.
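The idea of moving cold data out of the warehouse can be illustrated with a minimal Python sketch. It is not taken from Ramaiah's video or from any IBM product; the `last_accessed` field, the table names, and the 365-day threshold are all hypothetical choices made for illustration.

```python
from datetime import date, timedelta


def split_hot_cold(rows, today, max_idle_days=365):
    """Split catalog rows into hot (stay in the warehouse) and cold (offload candidates).

    A row counts as cold when it has not been accessed within max_idle_days.
    The threshold is an assumption; a real policy would depend on the workload.
    """
    cutoff = today - timedelta(days=max_idle_days)
    hot = [r for r in rows if r["last_accessed"] >= cutoff]
    cold = [r for r in rows if r["last_accessed"] < cutoff]
    return hot, cold


# Hypothetical catalog entries describing when each table was last queried.
rows = [
    {"table": "orders_2013", "last_accessed": date(2013, 6, 1)},
    {"table": "orders_2009", "last_accessed": date(2010, 1, 15)},
]

hot, cold = split_hot_cold(rows, today=date(2013, 6, 10))
```

Here `cold` would hold the tables that are candidates for shunting over to Hadoop, while `hot` stays in the warehouse.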

One effective way of using Hadoop with an existing data warehouse is to use Hadoop as a “landing zone” for big, raw data, Ramaiah says. “Instead of taking all this directly into your warehouse or other aspects of your enterprise environment, what if you could bring all this data, land it in Hadoop, use it as a place where you can do some pre-processing of this data, and then determine if you take it on to other systems?” he asks in the video.
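As a rough illustration of the landing-zone pattern, the following Python sketch pre-processes raw records and then decides which ones move on to the warehouse. It uses plain Python rather than any Hadoop API, and the JSON-lines input, the `customer_id` and `amount` fields, and the routing rule are assumptions made for the example, not details from the video.

```python
import json


def preprocess(raw_lines):
    """Parse raw JSON lines that have landed in staging, skipping malformed ones."""
    for line in raw_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # unparseable input is dropped in this sketch
        if isinstance(record, dict):
            yield record


def route(records, required_fields=("customer_id", "amount")):
    """Send records with the structured fields onward; keep the rest in the landing zone."""
    warehouse, landing = [], []
    for record in records:
        if all(field in record for field in required_fields):
            warehouse.append(record)
        else:
            landing.append(record)
    return warehouse, landing


# Hypothetical raw feed: one structured record, one garbage line,
# one semi-structured record, and one empty line.
raw = [
    '{"customer_id": 1, "amount": 9.5}',
    "not json at all",
    '{"clickstream": "/home"}',
    "",
]

to_warehouse, stays_in_landing_zone = route(preprocess(raw))
```

In this sketch only the record carrying both structured fields is forwarded; the semi-structured record remains in the landing zone for later exploration.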

The second common job for Hadoop in existing data warehousing environments is using Hadoop to perform data discovery and analytics on combinations of structured, semi-structured, and unstructured data, including real-time streaming data (possibly in conjunction with IBM’s text analytics engine). Since most data warehouses require structured data, this is an area where Hadoop and other big data tools can bring net new capabilities to an organization.

The third common way customers with existing data warehouses use Hadoop is by using their existing query tools against the columnar data store. “It’s a very effective way to do analytics,” Ramaiah says. “The MapReduce technology provides great performance. What would previously take you weeks and days now takes minutes and hours.”

Ramaiah advises organizations to start small with their Hadoop-based data warehouse augmentations, and grow from there. Given the large volume, velocity, and variety of big data, most projects will benefit from master data management (MDM) and data lifecycle management tools.

Organizations can assemble the various components they need as projects and budgets dictate, eliminating the need for a “big bang” big data project, according to Ramaiah. IBM’s distribution of the open source Hadoop framework, dubbed InfoSphere BigInsights, includes additional components and capabilities in the areas of text analytics, performance and workload optimization, data visualization, developer and administrative workbenches, enterprise application connectors and accelerators, and security.

Other big data products from Big Blue that might be used in a data warehouse augmentation project include InfoSphere Information Server, Optim, and Guardium.

