Oracle Taps Cloudera to Bolster Big Data Play
Today Oracle formally announced its most recent push into the big data ecosystem with the introduction of its Big Data Appliance. The integrated system pulls in the power of Hadoop via Cloudera’s distribution and set of management tools, and adds the open source version of R.
While the company provided some details about a forthcoming Hadoop-powered appliance during its annual OpenWorld conference (where, by the way, “big data” was the star of the show), the full details and specs emerged today, along with the ability for potential users to evaluate the solution independently and as a complement to their existing Hadoop setups.
“Cloudera brings us a couple of very important missing pieces, including its management software and assistance for a deeper second- and third-tier level of support,” George Lumpkin, Oracle’s vice president of product management, data warehousing, said in a statement today.
Oracle’s fresh big data product is nothing to sneeze at, performance-wise (at least judging from the specs). With 216 cores and 648 TB of raw disk, this 18-node system is ready to handle the real-time and data-intensive workloads it is designed to tackle. Oracle also outfitted the appliance with 40 Gb/s InfiniBand between nodes and 10 Gb/s Ethernet for data center connectivity.
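For a rough sense of what each node carries, the headline specs can be divided across the 18 nodes. This is only a back-of-the-envelope sketch: it assumes cores and raw disk are spread evenly across nodes, which the announcement does not state, and the variable names are illustrative.

```python
# Headline specs quoted for the Big Data Appliance.
NODES = 18
TOTAL_CORES = 216
RAW_DISK_TB = 648

# Assumption: resources are split evenly across nodes.
cores_per_node = TOTAL_CORES // NODES
disk_per_node_tb = RAW_DISK_TB / NODES

print(f"Cores per node: {cores_per_node}")
print(f"Raw disk per node: {disk_per_node_tb:.0f} TB")
```

Under that even-split assumption, each node works out to 12 cores and 36 TB of raw disk.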
Many hardware and software vendors are pushing to support Hadoop integration in their components as well as across engineered systems like the Big Data Appliance. The most recent is SGI, which produced a somewhat similar 20-node Xeon/Rackable counterpart to Oracle’s offering.
The appliance adds to a growing host of Oracle’s big data-flavored products, which now include its Exadata Database Machine, its Exalogic cloud and its Exalytics in-memory machine.
The Hadoop layer is strung in via the Big Data Connectors software layer, which the company says will let users easily integrate their data stored within Hadoop and Oracle NoSQL databases.
Oracle notes that the Big Data Connectors software is available for use with both Oracle Big Data Appliance and other Apache Hadoop-based systems. The bundle includes:
- Oracle Loader for Hadoop, which uses MapReduce processing to load data efficiently into Oracle Database 11g;
- Oracle Data Integrator Application Adapter for Hadoop, which enables Oracle Data Integrator to generate Hadoop MapReduce programs through an easy-to-use graphical interface;
- Oracle R Connector for Hadoop, which gives R users native, high performance access to the Hadoop Distributed File System (HDFS) and the MapReduce programming framework; and,
- Oracle Direct Connector for Hadoop Distributed File System (ODCH), which enables the Oracle Database SQL engine to access data seamlessly from the Hadoop Distributed File System.
As one might imagine, the system runs on Oracle Linux and includes the community edition of Oracle’s own NoSQL database and its HotSpot Java VM.
According to Doug Henschen, “The hardware and software combined will sell for $450,000, with an annual support fee for both hardware and software of 12%. That’s highly competitive, working out to less than $700 per terabyte and being in line with the low costs big data practitioners expect from deployments built on commodity hardware.”
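The per-terabyte figure in that quote can be checked with simple arithmetic. The sketch below assumes the figure is measured against the appliance’s 648 TB of raw disk and that the 12% support fee is applied to the full $450,000 price; neither assumption is spelled out in the quote, and the variable names are illustrative.

```python
# Figures quoted in the article.
SYSTEM_PRICE_USD = 450_000
RAW_DISK_TB = 648          # assumption: price per TB is based on raw capacity
SUPPORT_PERCENT = 12       # assumption: fee applies to the full system price

price_per_tb = SYSTEM_PRICE_USD / RAW_DISK_TB
annual_support_usd = SYSTEM_PRICE_USD * SUPPORT_PERCENT // 100

print(f"Price per raw TB: ${price_per_tb:,.2f}")
print(f"Annual support fee: ${annual_support_usd:,}")
```

On those assumptions the price comes to roughly $694 per raw terabyte, consistent with the “less than $700” claim, and the annual support fee to $54,000.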