September 25, 2017

Data Plane Offering Marks New Phase for Hortonworks

Alex Woodie


The new Data Plane Service (DPS) that Hortonworks unveiled today at the Strata Data Conference marks the start of a new phase for the publicly traded company, according to Arun Murthy, chief product officer and co-founder of the company.

If the Hortonworks Data Platform (HDP), the company’s distribution of Apache Hadoop, was its first stage, and its Apache NiFi-based stream processing system, dubbed Hortonworks Data Flow (HDF), was the second, then the launch of DPS marks the third, according to Murthy.

“It’s the third leg of the stool, but it’s a service rather than a product,” he tells Datanami. “You go to our service, then you point our service to your data assets and workloads, and we can start to manage it.”

The new Web-based DPS offering provides two main capabilities. First, it lets customers manage the security and governance of data wherever it resides. That could be in a Hadoop cluster, a stream processing system, or an enterprise data warehouse (EDW), whether on premises, in the public cloud, or in a hybrid of the two.

Any product that integrates with the APIs for Apache Ranger and Apache Atlas can be managed via DPS. “Anything that can talk to Atlas and Ranger, we can go access and service and configure,” Murthy says. “As long as they can leverage open source standards for security and governance or metadata management — Atlas, Ranger, and so on — we can access that and give you really great services on top of them.”

Second, DPS allows users to dynamically spin up cloud or on-premises clusters to process the data that’s managed via the Ranger and Atlas API hooks. The DPS offering includes built-in capabilities for common tasks, such as securely uploading data from on-premises systems to the cloud or moving data from one cloud to another. DPS gets its workload deployment capabilities via hooks into the Cloudbreak offering that Hortonworks obtained with its acquisition of SequenceIQ two years ago.

“Obviously it ties HDF and HDP together, but it’s certainly our intent to go beyond HDP and HDF,” Murthy says. “Data Plane is a cloud based services that delivers an extensible platform to reliably manage data and workloads…spin up clusters, and apply consistent security and governance policies across streaming data, tabular structured data, unstructured data and so on, regardless of where the data resides.”

DPS is not a product that you install, Murthy says, but rather a Web-based app store where customers can select different data management and processing items from an à la carte menu. “We want this to be extensible to more than just Hortonworks products,” he says. “We want you to be able to plug in multiple sources of product – it may be data in S3 or data in Azure or data in your enterprise data warehouse.”

While Hortonworks hasn’t had a data plane product, per se, until now, the company has been working on data plane concepts for the past 24 months, Murthy says. At the vendor’s Hadoop Summit in June 2016, Hortonworks executives laid out a clear vision for a federated data plane that allows customers to manage data and workloads wherever they might reside, from the data center to the IoT edge, with Atlas, Ranger, and Ambari providing the integration points.

This federated data plane idea goes somewhat against the core concept in Hadoop, which is that data should be centralized in one giant repository, and various applications are then brought to it for processing (i.e. “bring the compute to the data” rather than vice versa). Instead of physically storing all your data in one place, as many Hadoop vendors have recommended, with a federated data plane, you leave your various datasets where they want to sit and then connect them through logical views. The management layer on top of that is often called a data fabric.
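
As a rough illustration of the federated idea described above, the following is a minimal Python sketch, not Hortonworks code. The names `DataSource`, `LogicalView`, and `FederatedCatalog` are hypothetical, and the in-memory lists merely stand in for a Hadoop cluster and a cloud object store. It shows datasets staying where they are while a catalog resolves a logical view across them.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    """Hypothetical stand-in for one physical location (cluster, cloud bucket, EDW)."""
    name: str
    location: str
    rows: list = field(default_factory=list)


@dataclass
class LogicalView:
    """A named view that lists member sources instead of copying their data."""
    name: str
    source_names: list


class FederatedCatalog:
    """Toy catalog: registers sources in place and resolves views across them."""

    def __init__(self):
        self._sources = {}
        self._views = {}

    def register_source(self, source):
        self._sources[source.name] = source

    def define_view(self, view):
        # Refuse views that point at sources the catalog does not know about.
        missing = [n for n in view.source_names if n not in self._sources]
        if missing:
            raise KeyError(f"unknown sources: {missing}")
        self._views[view.name] = view

    def read(self, view_name):
        """Yield (location, row) pairs, reading each source where it lives."""
        for source_name in self._views[view_name].source_names:
            source = self._sources[source_name]
            for row in source.rows:
                yield source.location, row


# Illustrative data only; the source names and rows are made up.
catalog = FederatedCatalog()
catalog.register_source(DataSource("clicks", "on-premises Hadoop", [{"user": 1}]))
catalog.register_source(DataSource("orders", "cloud object store", [{"user": 1, "total": 30}]))
catalog.define_view(LogicalView("customer_360", ["clicks", "orders"]))
results = list(catalog.read("customer_360"))
```

Nothing is copied into a central repository here: the view only records which sources belong together, which is the contrast with the single-data-lake approach.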

“So instead of having one data lake to rule them all, it seems like enterprise need to manage multiple ponds and lakes and oceans of data,” Murthy says. “I love how Forrester talks about this notion of a data fabric, and if you like, the data plane is an instantiation of that concept.”

Forrester analyst Noel Yuhanna has been instrumental in defining data fabrics for the industry. Earlier this year, Yuhanna wrote a report on data fabrics that describes how they essentially combine a disparate collection of technologies to address key pain points in big data projects (such as data access, discovery, transformation, integration, security, governance, lineage, and orchestration) in a cohesive and self-service manner.

“The solution must be able to process and curate large amounts of structured, semi-structured, and unstructured data stored in big data platforms such as Apache Hadoop, MPP EDWs, NoSQL, Apache Spark, in-memory technologies, and other related commercial and open source platforms, including Apache projects,” Yuhanna wrote. “In addition, it must leverage big data technologies such as Spark, Hadoop, and in-memory as a compute and storage layer to assist the big data fabric with aggregation, transformation, and curation processing.”

For Hortonworks, DPS marks the start of a new service that gives customers the capability to manage data wherever it resides. DPS will expose “pluggable interfaces” that let Hortonworks and its ecosystem of partners offer a variety of data management services.

“What we’re hearing from enterprises is, we’ve been growing a lot of tooling and a lot of key technology,” Murthy says. “But I think what’s missing in the market is almost a fabric, if you will, over all this.”
