April 4, 2014

How to Speed Your Data Warehouse by 148x

Alex Woodie

Data analytic applications built on a combination of IBM’s DB2 BLU column-oriented data store and servers equipped with Intel’s latest Xeon E7 v2 processors can run up to 148 times faster than if they were running on the standard row-oriented DB2 engine and first-gen Xeon E7 processors, the companies announced last month.

The big data buddies clocked the colossal speedup in January during an internal test session that used the Proof of Performance and Scalability (POPS) benchmark running on a 10 TB star-schema database, a design often used in traditional data warehouses.

IBM and Intel broke the 148x figure into its constituent parts in a white paper released in mid-March. Adding BLU Acceleration to the standard DB2 10.1 database running on first-gen Xeon E7 processors was responsible for a 77x boost in performance. Then, moving up to Xeon E7 v2 processors provided a further 1.9x performance pop. Multiplied together, the two factors yield roughly the final 148x figure.
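The way the two factors combine can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the two speedups are independent and simply multiply; the variable and function names are illustrative. The rounded factors quoted above multiply to about 146x, slightly under the headline 148x, presumably because the published factors are themselves rounded (that explanation is an assumption, not something the white paper is quoted as saying here).

```python
# Rounded factors as quoted in the article.
blu_speedup = 77.0  # row-oriented DB2 10.1 -> DB2 with BLU Acceleration, same Xeon E7 hardware
cpu_speedup = 1.9   # first-gen Xeon E7 -> Xeon E7 v2


def combined_speedup(*factors: float) -> float:
    """Speedups applied one after another multiply (assuming independence)."""
    result = 1.0
    for factor in factors:
        result *= factor
    return result


total = combined_speedup(blu_speedup, cpu_speedup)
print(f"{total:.1f}x")  # product of the two rounded factors
```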

IBM and Intel worked together to ensure that the BLU engine can take advantage of the latest parallelized goodies in the multi-core Xeon E7 v2 processors, including Intel Advanced Vector Extensions (AVX) and Streaming SIMD Extensions (SSE) instructions. Jantz Tran, an Intel performance application engineer who has an office at IBM’s labs in Silicon Valley, explained the significance of this approach in a recent blog post on the Intel website.

“Packing columnar data into SSE registers allows you to use memory pools much more efficiently than row-based stores because you can run queries and evaluate data while it is still compressed,” Tran says. “In fact, data compression with columnar store is so much more efficient it requires a lot less memory to run the same data set. So you can house a much larger columnar database on a much smaller memory footprint.”
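To make the idea of evaluating data "while it is still compressed" concrete, here is a minimal sketch using plain dictionary encoding in Python. It is not BLU's actual encoding, and it does not use SIMD registers; the function names and the sample column are hypothetical. It only shows that an equality predicate can be answered by comparing small integer codes, without ever decoding the column back to its original values.

```python
from typing import Dict, List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[Dict[str, int], List[int]]:
    """Replace each distinct value with a small integer code.

    Codes are assigned in sorted order of the distinct values.
    """
    dictionary = {v: code for code, v in enumerate(sorted(set(values)))}
    codes = [dictionary[v] for v in values]
    return dictionary, codes


def count_equal(dictionary: Dict[str, int], codes: List[int], target: str) -> int:
    """Count rows equal to `target` by comparing codes, not decoded strings."""
    code = dictionary.get(target)
    if code is None:
        return 0  # value never appears in the column
    return sum(1 for c in codes if c == code)


# Hypothetical column from a star-schema dimension.
region = ["EMEA", "APAC", "EMEA", "AMER", "EMEA", "APAC"]
dictionary, codes = dictionary_encode(region)
print(codes)                                   # [2, 1, 2, 0, 2, 1]
print(count_equal(dictionary, codes, "EMEA"))  # 3
```

In a real columnar engine the codes would be bit-packed so that many of them fit in one vector register and can be compared in a single instruction; the list of Python integers here stands in for that packed representation.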

The actionable compression capability, in fact, enabled the 10TB data warehouse used in the POPS test to be 4.55 times smaller than if it were using standard static compression methods. “So if you have 10 TB of raw data and 2 TB of memory, you can run it as an in-memory database using DB2 with BLU Acceleration and Intel Xeon E7 v2 processors,” Tran says.

In other words, you can have your cake (faster query times) and eat it too (less hardware required). “The bottom line: These technologies allow you to run large primary databases directly in-memory at orders-of-magnitude improved performance,” Tran says.

IBM unveiled its BLU Acceleration database option exactly a year ago, shipping it first for AIX on its own Power processors and later for Linux on Intel processors. The column-oriented data store employs a combination of techniques, including in-memory caching, compression, and data skipping, to dramatically accelerate some types of queries. It’s a different approach from the one Microsoft took with the Hekaton in-memory feature it recently unveiled with SQL Server 2014. BLU is a separate column store that uses in-memory caching while most of the data remains on disk, whereas Hekaton is enabled in the database proper and allows data to be stored either in memory or on disk.
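Data skipping, one of the techniques named above, can be sketched as follows. The general idea is to keep a small summary (here, the minimum and maximum) for each block of a column, so that a range query can ignore blocks that cannot contain a match. This is a generic illustration under that assumption, not a description of how BLU implements it; the `Block` type, function names, and sample dates are all hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    lo: int            # smallest value stored in the block
    hi: int            # largest value stored in the block
    values: List[int]  # the block's column values


def build_blocks(column: List[int], block_size: int) -> List[Block]:
    """Split a column into fixed-size blocks and record min/max for each."""
    blocks = []
    for start in range(0, len(column), block_size):
        chunk = column[start:start + block_size]
        blocks.append(Block(min(chunk), max(chunk), chunk))
    return blocks


def count_in_range(blocks: List[Block], lo: int, hi: int) -> Tuple[int, int]:
    """Return (matching rows, blocks actually scanned) for lo <= value <= hi."""
    matches = 0
    scanned = 0
    for block in blocks:
        if block.hi < lo or block.lo > hi:
            continue  # summary proves no row can match, so skip the block
        scanned += 1
        matches += sum(1 for v in block.values if lo <= v <= hi)
    return matches, scanned


# Hypothetical order dates (YYYYMMDD), stored in roughly chronological order.
order_dates = [20140101, 20140105, 20140110,
               20140201, 20140207, 20140215,
               20140301, 20140312, 20140320]
blocks = build_blocks(order_dates, block_size=3)
matches, scanned = count_in_range(blocks, 20140201, 20140228)
print(matches, scanned, len(blocks))  # 3 1 3
```

Skipping pays off most when the data is loaded in an order that correlates with the filtered column, as with dates in a warehouse fact table; on randomly ordered data most blocks would overlap the query range and still have to be scanned.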

The Xeon E7 v2 processors can support up to 1.5 TB of memory per socket, which is three times the amount supported on the first-gen Xeon E7 processors. That gives an eight-socket E7 v2 server the capability to support up to 12 TB of memory. As in-memory systems become more affordable and generally accepted in the industry, IBM is poised to capture a share of the market for both transactional and analytical systems.


Applications: Enterprise Analytics

Technologies: Processors

Vendors: IBM

Tags: columnar, data warehouse, in-memory
