Big data may be the killer app for clouds, said Red Hat today, announcing its intention to seamlessly shepherd its considerable installed user base into a new era of computing.
In the announcement, Red Hat revealed a big data strategy that points directly at open hybrid clouds. The goal is to enable enterprises to create big data workloads in a public cloud environment (using tools that they’re already familiar with), and move those workloads seamlessly into a private cloud (and back again, if needed) without having to retool their applications.
As part of this direction, Red Hat announced that later this year it will move the Red Hat Storage Hadoop plug-in to the Apache Hadoop open source community, with the goal of making it a fully supported, Hadoop-compatible file system for big data environments.
“Big data could be one of the killer apps for open hybrid cloud,” said Ranga Rangachari, VP and GM for the Red Hat storage business unit in a press call earlier today, explaining their cloud-centric big data plan. “Anything we do, specifically on big data, on the infrastructure as well as the application side is all around helping customers run their applications either on premise, in a hybrid environment, or in a public cloud environment. The open hybrid cloud team weaves through every one of the projects that we do, specifically big data.”
To achieve cross-platform openness, Red Hat has chosen to pivot development of the Red Hat Storage Hadoop plug-in towards the Apache community; previously, development was exclusive to the Gluster community.
“The center of gravity around Hadoop is happening in the Apache community,” commented Rangachari in explaining the pivot. “We felt that the best way to foster innovation, to continue to keep the Apache Hadoop community innovating on a forward-going basis is to provide the developers with the access to the plug-in all from the same source. This doesn’t mean that [the Gluster community] stops innovating, but we felt that where it can maximize innovation is in the Apache Hadoop community.”
Rangachari says that most of the ‘proof of concept’ big data implementations that Red Hat sees from its customers are run in a public cloud for the simple reason that the customers don’t have the resources on premises to build out a many-node cluster (i.e., 20 to 50 nodes). Once a project reaches scale, they often want to bring it back in-house into a private cloud environment, but find themselves faced with retooling that may or may not work as expected.
Red Hat is careful to make the case that this strategy is a potential boon for big data because the company is largely already “in” big data, with clear momentum. Citing a report by the Linux Foundation, Red Hat pointed out that nearly three-quarters of all big data workloads run on top of Linux. Red Hat also says that 67% of machines on AWS run Linux, citing a study conducted by the AWS analyst service Newvem.
Red Hat also makes the point that its developer applications are already big data ready, referencing its application platform, which is already well known and doesn’t require climbing a steep learning curve.