June 17, 2015

Ex-Googler Now Helping Cloudera Build Hadoop

Alex Woodie

Cloudera scored a coup recently when it nabbed former Google executive Daniel Sturman to head up its engineering department. In an interview with Datanami, Sturman explains how he intends to use his experience designing distributed systems at the Internet giant to help evolve Hadoop.

Sturman was intimately involved in designing and running the software infrastructure that Google uses to run its massive online business. As vice president of engineering at Google, he led the teams responsible for the Google Compute Engine and Google App Engine. These systems weren’t based on Hadoop–Google was a big user of Hadoop and MapReduce early on and has since moved to other distributed systems. But the experience translates well to Hadoop, he says.

“What I’m bringing from that Google experience is an idea of how systems operate at large scale,” says Sturman, who reports to Cloudera CEO Tom Reilly. “You know you’re on a distributed system when some component that you never heard of fails and it brings you down. So I have a lot of experience…in successfully building systems to work at scale in distributed environments.”

Google was a pioneer in horizontally scaling commodity servers and it invested hundreds of millions, if not billions, of dollars to assemble top-notch engineering and development teams who were capable of building these systems from scratch. Ten years ago, Google’s competitors were doing the same thing–you’ll recall that Yahoo’s Doug Cutting based what would become the Hadoop Distributed File System (HDFS) in part on an obscure paper about the Google File System.

Fast forward to 2015, and Hadoop is on the cusp of giving people the same kind of distributed processing power that Google, Yahoo, and others worked so hard to build, but without all the blood, sweat, and tears. As Cloudera’s new vice president of engineering, Sturman is happy to be working with Cutting, who’s the chief architect at Cloudera.

“I’m very excited about where Hadoop is right now,” Sturman says. “Having seen how this stuff works at Google about the sort of insights you can get from data when you have the right tools at your disposal–I know the power of that. Google put a lot of time and engineers in building up that expertise, and rightfully Cloudera’s customers are a little bit more impatient. They want to unlock that power much faster. They don’t want to quite have that level of investment.”

Cloudera Vice President of Engineering Daniel Sturman

Sturman is just one week into his new job, which is barely enough time to map the routes to the office coffee machines, let alone properly introduce himself to all the members of Cloudera’s development team. But he has already outlined what he believes his role at Cloudera will be, and where he can have the biggest impact.

“This is an incredibly talented team here at Cloudera. They do their jobs very well and they know the community very well,” says Sturman, who was the driving force behind the Kubernetes container project while at Google and has a Ph.D. and master’s degree in computer science from the University of Illinois.

Sturman says his focus will be identifying the barriers to Hadoop adoption, and getting it past “the knee” in the adoption curve. “Cloudera has its competitors and we all compete for deals. But I think the real competitor is things that are blocking the people who aren’t using it from coming on board and using the technology. I’m not quite sure where that will be in the technology stack yet. I’m absolutely sure it will involve the Apache community in one way or another. But I’m really going to be looking at what are those inhibitors and how do we make them go away.”

Hadoop’s cup is either half full or half empty, depending on how you look at it. This duality was evident in a recent report from Gartner analyst Nick Heudecker, who authored an April study that found 54 percent of enterprises were not investing in Hadoop and had no plans to, which he dubbed “anemic adoption” that runs counter to the hype. The other way to look at that data is that 46 percent of enterprises have either already adopted it or are investing in Hadoop. That’s certainly how Rob Bearden, the head of Cloudera’s competitor Hortonworks, sees it. “The opportunity that sits in front of us is simply staggering to me,” Bearden said at last week’s Hadoop Summit.

Sturman recognizes that the potential of Hadoop is massive, but says it’s not quite where it needs to be. As the former director of development for IBM’s DB2 for Linux, Unix, and Windows database, Sturman can list “enterprise software development” on his résumé, too.

The Hadoop stack needs some filling out, Sturman says. “Enterprises tend to have a number of needs and expectations, which come from the way they managed databases and data under traditional systems,” he says. “And while we’re dealing with very different scales here, I think not all those tools are in place. I think Cloudera is doing a great job leading the market on that, but there still needs to be continued focus there in order to really enable people to do this and be comfortable about how to get there.”

Many enterprises are already getting value out of Hadoop, but we’ve just seen the tip of the iceberg in terms of the potential impact that Hadoop can have on business computing, according to Sturman. “It’s certainly not brand new, but I think we’re starting to see it become broadly adopted for a growing set of well-understood use cases,” he says. “Where I think we need to get to is where it becomes much more of a core, enterprise-ready asset.”

Hadoop is broadly adopted today in the financial services, retail, and telecommunications industries, and many of those engagements involved extensive technical services. As Hadoop adoption continues to spread in these industries, Sturman sees patterns emerging that will become the foundations for more standardized software offerings. “Within 18 months I think we’ll start seeing that with significant numbers, especially around core verticals,” he says. “We’re starting to better understand the problems and see the patterns so that stuff can really get productized in the not-too-distant future.”

It’s hard to separate the rise of Hadoop from the big data phenomenon. (At the recent Hadoop Summit, Hortonworks executives pondered whether Hadoop was driving the Internet of Things, or if the IoT was driving Hadoop.) Whatever the dynamics, the phenomena are intrinsically related and complementary.

Sturman says we’re “on the brink of something incredible” in how we get value from data. “When it all comes together–and I think it is coming together pretty soon–it gives you an exponential power effect, because they just build on top of each other,” he says.
