Cloudera, Dell, Intel Target ‘Big Data Ecosystem’
Cloudera, Dell and Intel said they plan to launch dedicated Dell In-Memory Appliances for Cloudera Enterprise as a way to boost real-time analytics performance.
The partners said the goal of the engineering partnership is to advance enterprise deployments of Apache Hadoop as well as to integrate hardware and software to help enterprises leverage advanced data analysis. The latter would be enabled through incorporation of real-time data streams into customers’ applications.
The partners said the initial effort would be the first in a family of jointly developed appliances known as “Dell Engineered Systems for Cloudera Enterprise.”
The partners contend that datacenters are becoming too complex for any one vendor to supply, and they intend to deliver an integrated solution. Separately, Dell was recently designated a “visionary” in market researcher Gartner’s rankings of integrated computing, storage and networking companies.
Based on its collaboration with Intel and Cloudera, Dell claims its Hadoop-based enterprise appliance, or “big data solution stack,” could allow some applications to run as much as 100 times faster than Hadoop MapReduce in memory and as much as 10 times faster on disk. The increased performance would allow customers to “incorporate real-time data streams into their applications,” Dell CTO Sam Greenblatt said in a statement.
“Traditional systems isolate data and create silos that make exploration and collaboration hard,” added Mike Olson, Cloudera’s co-founder and chief strategy officer. “By putting that data into a modern enterprise data hub, customers can combine it and analyze it….”
The Dell In-Memory Appliances are based on the Cloudera Enterprise data hub, which is built on open source Apache Hadoop. The appliances run on Dell R920 hardware based on Intel’s Xeon processor, the partners said. Cloudera Enterprise includes Apache Spark, a real-time analytics component that complements Hadoop.
The combination is said to make it easier to develop “unified big data applications combining batch, streaming and interactive analytics….” It is also being touted as compatible with existing datacenter infrastructure.
The partners said the in-memory appliances would be available in pre-configured versions that can be deployed based on customers’ applications.
For its part, processor maker Intel said it is continuing to invest in machine learning and graph analytics frameworks based on the open source Apache Hadoop platform, including Apache Spark. It is also working to optimize the platform to run on Xeon processors.
Together, the partners said they are attempting to build a “big data ecosystem” that combines data analytics hardware and software to move advanced data analytics to mainstream applications.
Since last year, Cloudera has been seeking to recast Hadoop as a centralized “data hub” for enterprises. The hub would serve as “one place to store and work with all data.”