Cloudera Commits to 100% Open Source
The old Cloudera developed and distributed its Hadoop stack using a mix of open source and proprietary methods and licenses. The new Cloudera will be 100% open source, just like Hortonworks, its one-time Hadoop rival that it acquired in January. But will developing its data platform completely in the open differentiate it from cloud competitors?
In a blog post published yesterday under the title “Our Commitment to Open Source Software,” Cloudera executives Charles Zedlewski and Arun Murthy laid out the company’s new plan to develop and distribute everything in the open.
“Prior to the merger, the two companies distributed their products under somewhat different open source licensing models,” Zedlewski and Murthy wrote. “Aligning the two models was one of the last items on our merger to-do list.”
Most of Cloudera’s software was already open source. The core Apache Hadoop project, for example, has always been developed in an open source manner, and key supporting pieces, like Apache Hive, Apache Spark, and Apache HBase, are developed and distributed through open source projects of their own.
But there were some key products in the Cloudera stack — specifically Cloudera Manager, Cloudera Navigator, and Cloudera Data Science Workbench — that have always been closed source. Over the next six months, Cloudera will transition all of its products, including previously proprietary ones, into open source.
The software company, which is publicly traded on the New York Stock Exchange under the ticker symbol CLDR, is emulating the successful commercial open source business model of Red Hat, which will soon be part of IBM.
“The subscription agreement will cover the terms of support and maintenance, as well as access to the latest updates and security patches,” the Cloudera executives wrote. “In this way, we will align Cloudera’s open source strategy as closely as possible with the market leading open source strategy developed by Red Hat and accepted globally by thousands of businesses.”
Cloudera will distribute its software using one of two open source licenses: the Apache License, Version 2.0, and the GNU Affero General Public License, Version 3 (AGPL). The previously closed products (like Cloudera Navigator) will be distributed under the AGPL. (The company had considered adopting a modified license, but nixed it in favor of licenses already accepted by the community, it said.)
Customers and developers will need subscription agreements to access Cloudera software. Subscription agreements for developers will allow for free downloads of the software, and will include support. Cloudera will also offer subscription agreements that allow customers to use the products for free for a short period of time.
The company will roll out its new subscription agreement starting in September 2019, with a plan to have all of its projects transitioned to open source by February 2020.
The company said the new licensing model will be fully implemented with the forthcoming launch of Cloudera Data Platform (CDP), which includes the hosted CDP Public Cloud version, as well as CDP Data Center, an on-premises, bare-metal version of CDP.
All of the projects that are currently developed by the Apache Software Foundation will continue under the ASF, the company announced. While it won’t move any projects away from the ASF, the company left open the possibility that it will look to other organizations for new projects.
“We are committed to 100% open source,” the company says in a handy FAQ. “Open source conveys real strategic benefits to Cloudera and our customers. The community innovates more broadly than any single company can. Open source creates standards and makes them easy for customers to adopt. It empowers developers to build on the platform, by exposing its implementation. And it insulates customers from lock-in and bad vendor behavior.”
The company is clearly hoping that developing and distributing its software in the open will provide some competitive advantage against the public cloud vendors, which are poaching disappointed Hadoop customers.
Cloudera cited “freedom from vendor lock-in” as one of the goals of the open source commitment. “Customers are entrusting their most valuable asset (their data) to our data management platform. They want to pay their platform vendor for added value, not out of fear of the cost of switching,” Zedlewski and Murthy wrote.
As public cloud providers gobble up customers’ data and workloads, vendor lock-in has become a bigger concern for customers. And Amazon Web Services, in particular, has been criticized for abusing the trust of the open source community by using open source projects to build data services that it turns around and sells to customers.
Open source NoSQL database vendors MongoDB, Redis Labs, and Elastic, in addition to Apache Kafka-backer Confluent, all have pushed back against AWS for perceived abuses of open source licenses. But Hadoop platforms, so far, have been immune to cloud encroachment.
According to Zedlewski and Murthy, open source has been good to Cloudera, and will continue to be core to its strategy going forward.
“Cloudera has developed many open source projects that have gone on to become industry standards, but no one company can be the sole source of innovation,” they wrote. “By investing in open source projects such as Spark, Kubernetes and Kafka, we keep our customers on a sustainable long-term architecture vs. pulling them onto an island of Cloudera-only developed tools.”
Open source has been a core component of Cloudera’s business strategy from the get-go, and it’s clearly not moving away from it. In fact, it’s putting all of its eggs in the open source basket, for better or for worse. At this stage in the company’s evolution — following the departure of its CEO last month and the steep decline in its stock valuation — you can’t accuse the company of not acting decisively.
Editor’s note: This story was corrected. Cloudera will offer free downloads of its products on a short-term basis, not in perpetuity. Datanami regrets the error.