January 14, 2014

Google Bypasses HDFS with New Cloud Storage Option

Alex Woodie

Google Hadoop customers can now run MapReduce jobs directly against data stored in Google Cloud Storage, leaving HDFS out of the big data equation, thanks to a new Cloud Storage connector for Hadoop that the Web giant unveiled today.

There are many reasons why you might want to bypass the Hadoop Distributed File System (HDFS) when running Hadoop jobs on Google Compute Engine instances in the Google cloud. For starters, using Google’s object storage lets you focus on data processing logic instead of on managing a cluster and file system, according to Google.

Using Google Cloud Storage instead of HDFS means no more file system checks, rebalancing, upgrades, rollbacks, or NameNode restarts, the company says. “Google Cloud Storage just works,” it adds. “Your data is safe and consistent with no extra effort.”

Getting started is faster too, since you don’t have to wait for Google Compute Engine to copy data to HDFS and for the NameNode to come out of safe mode, according to Google, which unveiled the new connector on its Google Cloud Platform blog.

The data is also safer in Google Cloud Storage, the company says, because it’s globally replicated. As a result, there’s no need to pay for additional backup (as you would if storing data in HDFS on Google Compute Engine VMs), which further reduces costs.

Storing the data outside of HDFS also keeps it separate from the Hadoop compute nodes and the NameNodes. (In fact, users don’t even need the NameNode when using Google’s object store.) This provides higher data resiliency, the company says: if the Google Compute Engine VMs on which the Hadoop cluster lives are turned off or crash, data held in HDFS is gone, whereas data held in Google Cloud Storage remains accessible.

Google Cloud Storage, which is built on Google’s Colossus storage infrastructure, is a RESTful service for storing and accessing data objects on Google’s infrastructure. Changing a Google Hadoop instance to use Cloud Storage is a simple matter of changing the URL to point to the object store instead of HDFS. Google also gives customers the option to store data in both HDFS and Cloud Storage, accessing each through a different file path.
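
To make the URL change concrete, here is a minimal sketch in Python of rewriting an HDFS path so that it points at the object store instead. It assumes the connector addresses Cloud Storage through `gs://` URIs; the helper name `to_gcs_uri`, the bucket `example-bucket`, and the NameNode address are hypothetical and appear only for illustration. It is not Google's own tooling, just one way to picture the path switch.

```python
from urllib.parse import urlsplit


def to_gcs_uri(hdfs_uri: str, bucket: str) -> str:
    """Rewrite an hdfs:// path as a gs:// path in the given bucket.

    Hypothetical helper for illustration; the bucket name is an assumption.
    """
    parts = urlsplit(hdfs_uri)
    if parts.scheme != "hdfs":
        raise ValueError(f"expected an hdfs:// URI, got {hdfs_uri!r}")
    # The HDFS authority (NameNode host and port) is dropped;
    # the Cloud Storage bucket takes its place.
    return f"gs://{bucket}{parts.path}"


if __name__ == "__main__":
    print(to_gcs_uri("hdfs://namenode:8020/logs/2014/01/14", "example-bucket"))
```

A job that previously read from the `hdfs://` path would then be given the rewritten `gs://` path as its input, while a cluster that keeps both stores could continue to use the two paths side by side.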


Applications: Enterprise Analytics

Technologies: Cloud

Sectors: Other

Vendors: Google Cloud

Tags: cloud, Hadoop, mapreduce
