July 10, 2012

Sears Startup Spins Managed Hadoop

Datanami Staff

One might not think to shop Sears for the latest “Big Data as a Service” offering, but the retailer’s parent company, Sears Holdings Corporation, is throwing some of its seasoned data and infrastructure pros behind such an effort.

MetaScale’s founder and CEO, Dr. Phil Shelley, also serves as CTO at Sears Holdings Corporation, where he’s been getting his hands dirty with everything from moving the company off the mainframe barge to implementing private clouds that tap into open source cloud tools.

Most recently, Shelley has been part of the ever-growing vocal camp making the conference rounds to talk big data and Hadoop for clouds. His partner and head of technology, Scott LaCrosse, is another Sears Holdings carryover who worked on the company’s database and data warehouse side.

One of the prime offerings the team has put together under the MetaScale branch of Sears Holdings is a managed Hadoop service that is designed, like other hosted solutions, to let users chuck the time and hassle of building Hadoop clusters to manage on their own.

The company’s services include building and configuring the Hadoop cluster; operating, managing and monitoring the cluster; integrating with other data management technologies; running data-cleansing and other data management projects; writing and developing MapReduce applications; and custom solution design and system integration.

In a recent strategic move, MetaScale joined Cloudera’s partner ecosystem, claiming to be the only company capable of “providing a full spectrum of services focused on big data in a virtual private cloud.” According to Krishna Nimmagadda, head of marketing and business development at MetaScale, the partnership between MetaScale and Cloudera solves issues of Hadoop complexity by combining Cloudera’s Distribution Including Apache Hadoop (CDH), Cloudera Enterprise management software and Cloudera University training programs with MetaScale’s ability to host managed Hadoop clusters and associated data management technologies in the cloud.

While MetaScale isn’t the first (and certainly won’t be the last) to offer cloud-based Hadoop services, the company claims that its virtual private cloud can stand up to the robust data demands of enterprise users. The founding team’s experience managing cloud-based infrastructure and Hadoop environments for Sears Holdings is important, but for the still-growing ecosystem around managed Hadoop offerings like theirs, only time will tell how potential users will view the uncharted territories of both cloud and Hadoop, two areas that, despite serious adoption, make some potential explorers wary.

Others have already climbed aboard the hosted Hadoop express, some well before yellow elephants became synonymous with big data. The growing list of managed Hadoop providers includes smaller companies like San Diego-based Z Data, Inc., Virginia-based Global Computer Enterprises, and Mortar Data, not to mention services hosted on Amazon (EMR) and IBM’s SmartCloud Enterprise infrastructure-hosted play, InfoSphere BigInsights, which allows for hosted Hadoop-based big data analytics.

While that list is just a short sample, the fact remains that finding a way to nix the complexity of your own Hadoop cluster to have, hold and manage will get easier, and likely cheaper, as more managed Hadoop services find their way into the market. Even without them, the main distro vendors (Cloudera, Hortonworks, MapR) are offering on-ramps for managing Hadoop clusters that make it easier to tap into the much-touted power of the large-scale data handling framework.
