Hadoop Hits Primetime with Production Release
This week the Apache Software Foundation announced that Hadoop has reached its full 1.0 production release, an effort that has been in the works for over six years. According to the foundation, the platform has reached the level of stability needed to make it enterprise ready.
According to Arun C. Murthy, VP of the Apache Hadoop project, “this release is the culmination of a lot of hard work and cooperation from a vibrant Apache community group…Hadoop is becoming the de facto data platform that enables organizations to store, process and query vast torrents of data, and the new release represents an important step forward in performance, stability and security.”
This “production ready” version has received the final nod from a large set of users, data gurus, system engineers and others, and includes support for a number of core features (many of which are already in full production around the world). Among the certified features supported in the new version are HBase support (sync and flush support for transaction logging); beefed-up security via Kerberos; a RESTful API for tapping into HDFS; “performance enhanced” access to local files for HBase; and a boatload of other minor fixes and features, all of which are listed on the project's main site.
To highlight the production-readiness of the platform’s 1.0 release, Apache pointed to the large number of organizations that have been running Hadoop successfully in production, some for more than a couple of years already. These include household names like AOL, Facebook, HP, LinkedIn, Netflix, Twitter and The New York Times.
While these user stories are compelling enough, one of the more interesting developments over the last year in particular has been the wide adoption and integration of Hadoop across a vast swath of the IT industry. Hardware and software companies alike have scrambled to make connectors of all varieties in order to persuade customers that their integration can add ease of use to the notoriously complex platform.
The real story that will unfold over the next several months is hard to foretell. On the one hand, one might suggest this will give companies offering supported, tricked-out versions of Hadoop (Cloudera, for instance) and those with their own similar platforms (HPCC Systems comes to mind) a run for their money.
But then again, most folks considering adopting Hadoop already know that it is being used in a large number of production environments. Viewed that way, the announcement of a stable version for enterprise use is more of a ceremony than something that is really going to change the minds of the masses.
This is certainly one of the most interesting efforts in the big data space, and a year of intense wrangling within the rather robust ecosystem that has developed around Hadoop lies ahead.