February 3, 2014

Cloudera Shuffles Its Product Deck in Pursuit of ‘Data Hub’ Strategy

Alex Woodie

Cloudera today unveiled a new three-tiered product packaging strategy for its Hadoop software, including a new high-end “Data Hub” Edition designed to help it compete against the likes of IBM and Pivotal. The company also announced the availability of the Spark stream processing and machine learning engine.

Ever since Cloudera unveiled its Enterprise Data Hub strategy at the Hadoop World + Strata conference in New York City last fall, the company has been trying to separate itself from the rest of the Hadoop pack and position itself as the most enterprise-ready Hadoop distribution, one that can do battle against megavendors like IBM and Pivotal (spun out of EMC) that are tiptoeing around the edge of Hadoop’s encampment.

The company took a big step toward fulfilling that vision today with a fairly major reshuffling of its products. Previously, the company offered a single paid product dubbed Cloudera Enterprise, and customers would select and buy additional engines, such as Impala, as needed on an individual basis.

That piecemeal approach is gone now, replaced by three products that live under the Cloudera Enterprise umbrella. At the low end is the Basic Edition, which includes the core Hadoop components (HDFS and MapReduce), along with the Cloudera Manager software. One step up from that is the Flex Edition, which includes core Hadoop and Manager and allows customers to choose one premium engine, such as Impala, HBase, Search, or Spark, per cluster. At the high end is the Data Hub Edition, which includes all of the premium engines.

Cloudera made the change after analyzing the product use patterns of its customers and determining this new product packaging approach most closely matched those uses, says Clarke Patterson, senior director of product marketing for Cloudera. “Rather than taking a nickel and dime approach, like we have in the past, we’re segmenting everything much more along with how we see organizations consuming Hadoop based on their maturity,” Patterson tells Datanami.

The first group of Hadoop users uses it only for storage and processing, which requires only HDFS and MapReduce. They would use the Basic Edition. The second group of users is a little more involved with Hadoop and may start trying to explore their data in new ways, in which case they would use one of the premium data engines. The third group is moving forward aggressively with Hadoop and is looking to integrate it into its overarching enterprise data strategy, Patterson says.

Closing a deal with a customer from the third group was not as clean as it could be. “When a customer says, ‘This all sounds beautiful to me. What do I buy?’ We’d say, ‘It’s this plus this plus this plus this.’ It starts to get messy in a hurry,” Patterson says.

Keeping things neat and tidy will help Cloudera go up against the likes of IBM and Pivotal, which are adding Hadoop to existing product stacks. “In a roundabout way, they’re kind of painting a data hub vision with multiple products, whereas our value proposition is we have a single product, where all your data sits in one place,” Patterson says. “You bring your compute to the data, run multiple workloads, secure, govern it–all those other things, within a single offering.”

Cloudera says customers won’t pay more for the same functionality under the new product scheme. However, it refused to provide pricing information to confirm that.

In any event, the move away from an a la carte menu to one that is less granular will definitely encourage customers to spend more with Cloudera. For example, a customer who wanted only two of the premium engines–say, Impala and Search–will be forced to move up from the Flex Edition (which only includes one premium engine) and buy the Data Hub Edition, which includes all of the premium engines.

Patterson says Cloudera is consciously encouraging users to do more with their Hadoop implementations. “A lot of it comes back to misperceptions of what the technology is capable of, quite frankly,” he says. “I don’t think there’s a lot of organizations that realize the breadth of capabilities that exist. [People say] ‘There’s a lot of hype, none of it is really real.’ We have an offering today that is enterprise-ready as people begin to do very meaningful things.”

Cloudera will continue to provide a free version of its Cloudera Distribution for Hadoop (CDH) software under an open source license. But instead of calling it Cloudera Standard, the company changed the name of the free version to Cloudera Express. If it all reminds you of the way that bigger commercial open source software vendors package and sell their products, it’s because Cloudera is doing exactly that.

In other news, Cloudera announced that it’s now including Apache Spark in its premium Hadoop offerings. Last fall, the company partnered with Databricks, the company behind the in-memory Spark stream processing engine and an associated library of machine learning algorithms. Spark is an in-memory engine written in Scala that’s viewed as a much-speedier alternative to MapReduce. Databricks raised $14 million last year, and released Spark 0.9 today. Spark is available for CDH 4.4 and newer releases.
