February 12, 2015

Project Myriad Brings Hadoop Closer to Mesos

Alex Woodie

One of the challenges of running Hadoop is resource management. Spinning up and managing hundreds, if not tens of thousands, of server nodes in a Hadoop cluster (and spinning them down, moving them, and so on) is far too hard to do manually. Automation must come to the table to help Hadoop take the next step forward in its evolution. The big question is how it will unfold.

One answer to that question came to the forefront yesterday when a group of companies led by MapR Technologies and Mesosphere unveiled Myriad, a new project that aims to enable Hadoop jobs running under Apache YARN to be managed using the Mesos resource manager for data centers.

The goal of the Myriad project, which is hosted on GitHub and counts EBay and Twitter as contributors, is to deliver code that effectively unites Apache YARN and Apache Mesos, the resource manager that was developed at UC Berkeley’s AMPLab and that today underpins the data center operating system (DCOS) built by Mesosphere. According to Mesosphere, DCOS is a new kind of operating system that organizes all of one’s machines, virtual machines, and cloud instances into a single pool of shared resources. Mesos runs atop Linux and is already used in production by Twitter, Netflix and Airbnb.

With Myriad, any YARN-compatible Hadoop job, such as Spark, MapReduce, Pig, or Hive, will be able to run on the same hardware as non-Hadoop applications. Those could be anything from streaming applications like Storm or Kafka and developer tools like Jenkins to HPC jobs running under MPI and regular old Web server workloads.

Myriad brings together both major resource managers for Hadoop and other important apps, says Florian Leibert, CEO and co-founder of Mesosphere. “Big data developers no longer have to choose between YARN and Mesos for managing clusters,” Leibert says in a press release. “Myriad allows you to run both, and to run all of your big data workloads and distributed applications and systems on a single pool of resources.”

(Image: The Mesos DCOS, according to Mesosphere.)

Jack Norris, chief marketing officer for MapR, says Myriad delivers the tools that Hadoop users are asking for in the area of cluster management. “One of the motivations with EBay and Twitter is that they have extensive use of Hadoop and they have Web servers that are provisioned for peak and have long periods of low utilization,” he tells Datanami. “The ability to use Myriad to fill those long periods of low utilization with some Hadoop workload and then pull those off in anticipation of peak demand, that provides additional efficiency within the data center.”

Obviously, not every Hadoop user is going to be so heavily invested in data center hardware that they need Myriad to unite operational and analytic workloads on the same cluster. Companies like Twitter and EBay have hundreds of thousands of nodes to manage, if not millions, so even small percentage gains in efficiency translate into lots of dollars saved. A similar dynamic is in effect at HPC sites, where organizations want to start incorporating big data analytic technologies like Hadoop and Spark, but are loath to dedicate their entire supercomputer to such tasks.

But in the long run, the benefit that comes from provisioning the same hardware in multiple different ways is clearly in the cards for all types of applications, and will help at any scale. The burgeoning diversity of big data applications, from the Hadoop/YARN family and NoSQL databases to in-memory data grids and real-time streaming systems, will benefit from the more fluid and flexible methods of deployment that Myriad aims to deliver.

“Combining Mesos, which is really good at resource management in general and has pretty good low-level granularity in terms of disk and network and CPU and memory, with YARN, which is really good at managing Hadoop resources but lacks some of that granular control and doesn’t really work across non-Hadoop resources, gives you this dynamic capability to configure YARN and have virtual Hadoop clusters in the data center,” Norris adds. “And it’s a completely open-source project that will work across different Hadoop distributions. It’s not limited to MapR.”

Concurrently, the folks behind Project Myriad plan to submit it as an Apache Incubator project with the Apache Software Foundation by the end of the first quarter of 2015.


Applications: Complex Event Processing

Technologies: Frameworks, Systems

Sectors: Retail

Vendors: EBay, MapR Technologies, Mesosphere, Twitter

Tags: Hadoop, mapr, Mesos, Myriad
