March 29, 2016

Deploying Hadoop on User Namespace Containers

Abin Shahab

(2M media/Shutterstock)

Hadoop is increasingly moving to the cloud, with Gartner reporting that over 50% of companies are considering a cloud-only or hybrid cloud solution for Big Data. Altiscale has offered a high-performance, secure, multi-tenant cloud solution since 2014, with its multitenancy and performance capabilities driven by namespaced Docker containers.

In my Thursday session, titled “Deploying Hadoop on user namespace containers,” I will share what I have learned from years of work on making Hadoop run effectively in the cloud.

Docker Containers

Docker is a very popular container technology. A Docker container provides an isolated environment similar to a lightweight virtual machine (VM), but with better performance than a VM, approaching the performance of bare metal. I’ll explain how to treat a container like a VM or physical machine and how to expand its capabilities so that it can do everything a machine can do.

Docker and Elastic Scaling

At Altiscale, Hadoop is deployed on our data centers in a way that allows customers to process petabytes of data without worrying about Hadoop cluster management. Altiscale clusters grow and shrink elastically to keep pace with the customer’s compute and storage needs.

This elasticity is achieved by growing and shrinking the number of slave nodes. Docker containers make it possible to launch NodeManagers and DataNodes in under a second, so the cluster can respond rapidly to shifting customer demands. Altiscale achieves greater isolation than Docker alone provides by applying our user-namespace solution on top of Docker, so that no user inside these Hadoop slaves has root privileges on the host. I’ll describe the Altiscale elastic cluster model, the design decisions behind it, and the issues I encountered and addressed.
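To illustrate the general idea behind user namespaces (this is a sketch of the Linux mechanism, not Altiscale's implementation): the kernel exposes each process's UID mapping in /proc/<pid>/uid_map as lines of three numbers, giving the first ID inside the namespace, the first ID outside it, and the length of the range. The Python sketch below parses text in that format and translates a container UID to a host UID. The helper names and the example mapping (container UIDs 0-65535 mapped to host UIDs starting at 100000) are assumptions chosen for illustration.

```python
from typing import List, NamedTuple, Optional


class IdMapping(NamedTuple):
    inside_start: int   # first ID as seen inside the namespace
    outside_start: int  # corresponding first ID outside the namespace
    length: int         # number of consecutive IDs covered by this range


def parse_uid_map(text: str) -> List[IdMapping]:
    """Parse text in the uid_map format: 'inside outside length' per line."""
    mappings = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(f) for f in fields)
        mappings.append(IdMapping(inside, outside, length))
    return mappings


def to_host_uid(container_uid: int, mappings: List[IdMapping]) -> Optional[int]:
    """Translate a UID inside the namespace to a host UID, or None if unmapped."""
    for m in mappings:
        if m.inside_start <= container_uid < m.inside_start + m.length:
            return m.outside_start + (container_uid - m.inside_start)
    return None


# Hypothetical mapping, for illustration only.
EXAMPLE_UID_MAP = "0 100000 65536\n"

if __name__ == "__main__":
    mappings = parse_uid_map(EXAMPLE_UID_MAP)
    print(to_host_uid(0, mappings))      # container root -> 100000
    print(to_host_uid(1000, mappings))   # -> 101000
    print(to_host_uid(70000, mappings))  # outside the mapped range -> None
```

With a mapping like this, a process that appears as root (UID 0) inside the container is an ordinary unprivileged user from the host's point of view, which is the property that keeps container users from holding root on the machine.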

Future Development Direction

I’ll also cover future developments in this area that would improve isolation and elasticity, such as nested containers, which would allow Hadoop users to launch their own containers. The session is Thursday from 11:00 to 11:40 a.m. in room 230 C. For more information, see the session description.

About the author: Abin Shahab is a Senior Software Engineer at Altiscale and a contributor to Hadoop, Docker, and LXC. Prior to joining Altiscale, Abin worked on graph databases and search engines at Guidewire, Symantec, and Vivisimo (IBM). Abin holds a master's degree in Software Engineering from Carnegie Mellon University and a bachelor's degree in Computer Science from the University of Arizona.

Applications: Enterprise Analytics

Technologies: Frameworks, Middleware, Processors

Sectors: Healthcare, Manufacturing, Retail

Tags: altiscale, containers, Docker, Hadoop, virtualization
