Follow Datanami:
September 23, 2013

Oracle Gives 12c Database a Column-Oriented Makeover

Alex Woodie

Oracle says the new In-Memory Option it unveiled today will allow its 12c database customers to run analytic workloads 100 times faster than they previously could. The secret sauce is a new column-oriented analytical data store, which sits right next to the traditional row-oriented data store used for transactions.

The dominance of the traditional relational database management system (RDBMs) is crumbling as organizations struggle to capture, store, and process vast sums of unstructured and semi-structured data. The RDBMs is simply not up the task, which in turn is leading organizations to explore other types of data stores, such as Hadoop, NoSQL, and many others.

Column-oriented data stores are well positioned to inherit some of the analytic workloads that organizations tried to run on RDBMs. From a disk-access point of view, it’s much more efficient to read data stored in a handful of columns versus opening millions of rows to do the same. The catch is that this approach doesn’t work as well for transaction processing. This is why Oracle database competitors, including HP Vertica, SybaseIQ, Teradata, and others, have adopted columnar database as the basis for data warehousing and analytic workloads.

Oracle CEO Larry Ellison unveils the In-Memory Option at Oracle OpenWorld this week.

The latest trend involves cohabitating analytic and transaction processing workloads onto the same database, using an in-memory, columnar approach, which is exactly what SAP has done with HANA and what IBM is doing with DB2 BLU.

Instead of ceding this market opportunity to Hadoop and NoSQL data analytic startups and traditional foes in the RDMBs business, Oracle is hoping to retain its customers’ business by shoring up its database with the hybrid In-Memory option, which allows customers to run analytic workloads side by side with traditional OLTP workloads. The company made the announcement today from its Oracle OpenWorld 2013 conference taking place this week in San Francisco.

This hybrid, or “dual-format,” approach involves implementing a separate column-oriented data store that’s optimized for analytical workloads right next to the traditional row-oriented data store in the 12c database platform. Data is stored in both formats simultaneously, and the data is maintained in a manner that is transactionally consistent, the company says.

Oracle says the column-oriented approach, combined with its single instruction, multiple data (SIMD) vector optimizations, enable applications to scan billions of rows of data per second for each CPU core. This adds up to deliver sub-second response times for reports running against the column-oriented data store, the vendor says. This gives customers the capability to perform real-time ad-hoc analytics on live transactional data, it says.

All told, Oracle claims that analytics and reports run 100 times faster this way than using the traditional row-based approach. What’s more, the software giant says reports run 20 times faster than they do when using pre-defined OLAP cubes, while table joins run 10 times faster.

Oracle says transactional applications will see a 2x speedup as well, thanks to the column-oriented approach. This is due to the fact that most indexes in a database are there to optimize analytical queries. However, these indexes aren’t needed a hybrid database, like 12c has become. As a result, there’s less overall overhead to transactions, since the indexes no longer need to be maintained.

Oracle says that every application designed to run on 12c can “automatically and transparently” take advantage of the new hybrid data store. This includes all of Oracle’s own enterprise applications, including products like E-Business Suite, PeopleSoft, and JD Edwards. At the same time, Oracle is apparently also giving a big speed-up to competitors, such as SAP’s Business Suite, which is often deployed atop the Oracle database.

Getting the benefits involves nothing more than “flipping a switch,” Oracle CEO Larry Ellison said.

“Virtually every existing application that runs on top of the Oracle database will run dramatically faster by simply turning on the new In-Memory feature,” Ellison said. “Our customers don’t have to make any changes to their applications whatsoever; they simply flip on the in-memory switch, and the Oracle database immediately starts scanning data at a rate of billions or tens of billions of rows per second.

This looks very much like the SAP HANA database, which provides an in-memory, columnar database that simultaneously runs OLTP and analytic workloads. HANA was designed to, among other things, displace Oracle from SAP accounts, by allowing them to run analytical and transactional workloads on the same server. Now that Oracle has a similar story to tell, it will be interesting to see where it goes next.

There are still some unanswered questions about how 12c’s In-Memory Option will work in the real world. Does it do best when scaling up vertically, on big symmetric multi-processor (SMP) boxes, or does it scale horizontally, using Oracle RAC clustering? Does it run better on cheap X64 boxes or will Oracle somehow “optimize” it to run on its Sparc servers or any of the Exa-appliances? How big can data get with the In-Memory Option? Traditional RDBMs typically start to sputter when they get into the multi-terabyte range. Does the In-Memory Option help or hurt scalability in this regard?

The In-Memory Option for the 12c database is expected to become available later this year.

Related Items:

Analytics 3.0: A Blend of Big Data and Old Techniques

Six Super-Scale Hadoop Deployments

CFOs Stepping Up in Analytics Adoption