September 10, 2018

‘Open Hybrid’ Initiative Targets Big Data Workloads

Hortonworks, IBM and Red Hat today announced they’re banding together to build a consistent hybrid computing architecture for big data workloads. Dubbed the Open Hybrid Architecture Initiative, the program pledges simplicity of deployment and freedom of movement for data apps.

The rapid ascent of cloud computing platforms like AWS, Azure, and Google Cloud has given enterprises abundant new options for storing data and deploying processing-intensive applications, such as deep learning and real-time stream processing. Throw in the progress being made at the edge, with sensors and speedy ARM chips collecting and processing massive amounts of data, and you have the makings of a computing revolution.

While the computing possibilities in the cloud and on the edge may appear bountiful, the reality is that the underlying architectures for building apps that can span cloud, edge, and on-prem environments are just starting to come together. Enterprises today face a dearth of repeatable patterns to guide the developers, administrators, and architects tasked with building, deploying and maintaining hybrid applications that span not just the cloud and the edge, but traditional on-prem data centers too.

That’s the challenge the Open Hybrid Architecture Initiative aims to address. The initiative launched today in advance of the Strata Data Conference in New York City, and the three companies outlined plans to integrate their respective technologies in such a way as to provide customers with greater freedom of movement and run-time options for their big data workloads.

Hortonworks is counting on its DataPlane Service to streamline hybrid deployments of its products

In terms of actual deliverables, the initial phase of the initiative calls for the companies to integrate various products. Specifically, Hortonworks Data Platform (HDP), Hortonworks DataFlow (HDF), Hortonworks DataPlane (DPS) and IBM Cloud Private for Data will be optimized to run on OpenShift, Red Hat’s distribution of Kubernetes for containerized applications.

The companies say the move will “provide the vast OpenShift community of developers and users – which include IBM and Hortonworks clients – fast access to robust analytics, data science, machine learning, data management and governance capabilities, fully supported across hybrid clouds.”

Enterprises will be able to access and process data no matter where it resides as part of the Open Hybrid Architecture Initiative, says Hortonworks co-founder and CTO Arun Murthy.

“Through the initiative, we deliver an architecture where it absolutely will not matter where your data is – in any cloud, on-prem or the edge – enterprises can leverage open-source analytics in a secure and governed manner,” he says in a blog post today. “The benefits of ensuring a consistent interaction model cannot be overstated and provides the key to unlocking a seamless experience.”

Kubernetes stands to play a starring role in the Open Hybrid Architecture Initiative. The open source container management software serves as a virtualization layer that decouples runtime environments from underlying hardware, while providing the capability to spin up, spin down, and move software applications at the administrator’s will.
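To make the "spin up, spin down" idea concrete, here is a minimal Python sketch of how a containerized workload is described to Kubernetes and how its replica count changes. It only builds the manifest as a plain dictionary and does not talk to a cluster. The workload name `analytics-engine` and the image `example.com/analytics-engine:1.0` are hypothetical placeholders and do not refer to any product named in this article.

```python
import copy
import json


def make_deployment(name, image, replicas):
    """Build a Kubernetes Deployment manifest (apps/v1) as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # Number of identical pods Kubernetes should keep running.
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


def scale(manifest, replicas):
    """Return a copy of the manifest with a new replica count."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    scaled = copy.deepcopy(manifest)
    scaled["spec"]["replicas"] = replicas
    return scaled


if __name__ == "__main__":
    # Hypothetical workload name and image, used for illustration only.
    engine = make_deployment(
        "analytics-engine", "example.com/analytics-engine:1.0", 3
    )
    # Spinning the workload down is a change to one declared number.
    print(json.dumps(scale(engine, 0), indent=2))
```

Because the manifest names a container image rather than a particular machine, the same description could in principle be submitted to a cluster in a data center or in a public cloud, which is the decoupling described above.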

“By building and managing their applications via containers and Kubernetes with OpenShift,” says Ashesh Badani, vice president and general manager of Cloud Platforms at Red Hat, “customers and the big data ecosystem have opportunities to bring this next generation of big data workloads to the hybrid cloud and deliver the benefits of an agile, efficient, reliable, multi-cloud infrastructure.”

IBM is working toward “primed” status for running IBM Cloud Private for Data on OpenShift, which it expects to achieve later this month. “Scaling the ladder to AI demands robust data prep, analytics, data science and governance, all of which are easily scaled and streamlined in the kind of containerized, Kubernetes-orchestrated environments that we’re talking about today,” says Rob Thomas, general manager of IBM Analytics.

Hortonworks is following a similar path with its products, including DPS, which it launched a year ago. In a hybrid architecture, DPS will be called upon to spin engines like Hive and Spark up and down while maintaining the controls that enterprises demand. “This allows customers to more easily adopt a hybrid architecture for big data applications and analytics, all with the common and trusted security, data governance and operations that enterprises require,” Murthy says.

It’s not clear if any cloud providers are working with the Open Hybrid Architecture Initiative at this point, or if there are any other members of the group. There is no website set up yet for the Open Hybrid Architecture Initiative, but a spokesperson for Hortonworks says there will be one soon.

The Hortonworks spokesperson says cloud platform vendors are welcome to join the group. “We welcome participation from anyone who wants to collaborate to accelerate hybrid for customers,” the spokesperson says.

In any event, it’s not all about moving applications out of the data center and into the cloud. According to a recent IDC survey, more than 80% of respondents said they plan to move or repatriate data and workloads out of public cloud environments and back behind the firewall, to hosted private clouds or on-premises locations, over the next year.

The reason for this “application repatriation” is that the “initial expectations of a single public cloud provider were not realized,” IDC says.
