June 3, 2014

Hortonworks Spins Up a YARN Readiness Program

Hortonworks today launched its YARN Ready Program to help Hadoop application vendors adopt the technology that’s at the heart of the modern Hadoop v2 infrastructure.

YARN is the key piece of technology that enables multiple data engines to run simultaneously on the same Hadoop cluster, and is the centerpiece of Hadoop v2, which launched last October. Most Hadoop v1 applications were written using MapReduce, but YARN and Hadoop 2 bring new opportunities for building integrated data-driven applications.

There’s still a lot of legacy Hadoop v1 code out there, but customers are beginning to get on the YARN bandwagon, says John Kreisa, VP of strategic marketing for Hortonworks.

“We’ve seen a continued interest both with software vendors wanting to integrate their products on YARN, and the enterprises who want to find YARN-integrated products so they can really realize the benefit of Hadoop 2 with multiple workloads running on a single Hadoop instance,” he tells Datanami.

“I wouldn’t say we have achieved anywhere near 100 percent awareness for the technology,” he continues. “But in the more mature Hadoop implementations, they’re asking for it. They all know Hadoop 2 is the enterprise-ready version of Hadoop. They know that this is the blueprint for pulling Hadoop into the enterprise, and they know YARN is the architectural center that’s driving the broader use of Hadoop, and so they’re starting to look for it and ask for it by name.”

Hortonworks is eager to sign application vendors up for its YARN Ready Program, which is an extension of its Partner Certification Program. As part of the program, which is free to software vendors, Hortonworks will provide technical assistance and guide vendors through the process of YARN-enabling their Hadoop applications, whether those are existing apps or new ones.

There are four ways to integrate with YARN, Kreisa says, and each path brings its own costs and benefits. Developers can go “YARN native” and use the YARN API directly in their apps. This gives the developer the most control, but writing this close to Hadoop in a “machine language” style is not for the faint of heart.

Developers can also utilize good old MapReduce as a path to working with YARN and getting the co-habitation benefits that it brings. Hadoop v1 MapReduce code is about 98 percent compatible with Hadoop v2 and YARN, requiring just a bit of work to bring it over.

Working with Apache Tez is the third way to get to YARN. Hortonworks is bullish on the prospects of Tez to replace MapReduce as a data-processing engine in Hadoop 2. Tez, which came out of the Stinger project that Hortonworks spearheaded to speed SQL processing in Hive, makes a particularly good interface for running query tools against HDFS data, Kreisa says.

Finally, there’s Apache Slider, which provides a good interface for real-time communication, especially with database apps running on Hadoop. “Slider’s designed for real-time integration with the cluster, so database-style workloads where a very high-speed transfer or data processing and response times are required,” Kreisa says.

Hortonworks unveiled the YARN Ready program today at the Hadoop Summit, which it’s hosting in San Jose, California. The Hadoop distributor has already signed up several prominent Hadoop application vendors to the program, including Concurrent, DataTorrent, IBM, Teradata, Splunk, MicroStrategy, and Actian.

“Being YARN Ready ensures our clients that analytics execute in a cooperative manner with other workloads running in the cluster,” says Michael Hiskey, vice president of product marketing at MicroStrategy.

Henry Sohn, vice president of business operations at DataTorrent (which today is announcing general availability of its Real Time Streaming application for Hadoop) said: “YARN Ready certification provides our customers with the knowledge that they’ll be able to scalably ingest data from any source, process it at sub-second speeds and take action in real-time.”

Hortonworks isn’t the only Hadoop distributor with a certification program. But it appears to be the only one targeting YARN. There’s no guarantee that apps that have been certified under Hortonworks’ program will be 100 percent out-of-the-box compatible with Hadoop from other distributors, Kreisa says.

Each Hadoop vendor’s distribution is its own, but there’s nothing stopping Cloudera, MapR Technologies, and others from launching their own programs to beat the YARN drum in a similar fashion.

Related Items:

Hortonworks Keen on Cascading-Tez Combo

The Future of Hadoop Runs on Tez, Hortonworks Says

Hadoop Version 2: One Step Closer to the Big Data Goal
