November 6, 2013

OLTP Clearly in Hadoop’s Future, Cutting Says

Alex Woodie

Think Hadoop is just for analytics? Think again, says Hadoop creator Doug Cutting, who last week predicted that, in the future, organizations will run all sorts of workloads on their Hadoop clusters, even online transaction processing (OLTP) workloads, the last bastion of the relational legacy.

Cutting didn’t don a wig or fancy robe when he made his predictions about the future of Hadoop during a speech at the Strata + Hadoop World conference last week. He didn’t wave a magic wand or use a crystal ball. Instead, the plain-speaking technophile made his points by tapping into his own vast repository of knowledge on the topic. Oh, and PowerPoints.

“I don’t have a time machine. I can’t see the future any better than you can,” Cutting said. “I’m a guy who, in the past, looked at the present, looked at facts, and decided what to do next. I’m not attempting to look too far down the road.”

But as chief architect for the leading Hadoop distributor Cloudera, it’s in Cutting’s job description to have some idea where Hadoop is headed. Besides, it was Cutting himself who set this ball into motion 10 years ago when he started writing this software product that’s having such a big impact on the IT industry and, arguably, the world at large. Clearly, the guy has an opinion on the matter, and that opinion clearly matters.

The basic facts, as Cutting sees them, are pretty clear. It all starts with Moore’s Law, which has given us continuous exponential increase in computing power for close to 50 years. “I wouldn’t bet against it continuing to improve,” he said. “We’ll be able to store and process more data in the future than we can today.”

Much of that data will be stored and processed in Hadoop, if Cutting’s predictions about Hadoop turning into an operating system kernel for a data-centric platform turn out to be accurate. Obviously, Hadoop can’t be a kernel in the same sense that Linux has a kernel or that Windows has a kernel. What Cutting means is that Hadoop will become the de facto standard on which developers will build applications in the future.

What started out as a limited, unsecure, and unreliable system for processing Java workloads has matured into a scalable, secure, and reliable platform for running all sorts of applications, Cutting said. “We saw initially higher level languages, Pig and Hive, that removed the requirement that you be a Java programmer to make use of this,” he said. “Then we started to see, in parallel, the addition of real-time components. First HBase providing a NoSQL API, then Impala with interactive SQL, and more recently, search.”


Hadoop is clearly just getting started, as this slide from Cutting’s presentation demonstrates.

It doesn’t take a data scientist to do a basic extrapolation of recent events around Hadoop, and see that it’s going somewhere. “More and more types of workloads will be supported on top of Hadoop,” Cutting said. “It’s a clear trend. In the near future, we’re seeing Spark in-memory streaming, graph–all kinds of new processing metaphors moving to this platform, providing you with new tools to combine, view, analyze, understand your data. And that, we can expect to continue.”

If this sounds a lot like the “Enterprise Data Hub” future for Hadoop that Cloudera CEO Mike Olson shared with the world last week, that’s because it is. “How far can we go with this? What’s the limit here?” Cutting asked. “My belief is the sky is the limit. It’s hard to imagine a kind of a workload that you can’t move to this platform.”

Obviously, there have to be limits, even if we can’t see them. But according to Cutting–who had the foresight to see that a new software platform would be needed to solve the problems of the future–those limits do not rule out OLTP. There’s no reason why OLTP can’t run on Hadoop, he said.

That in itself is a change of tune for the highly scalable pachyderm. “Transactions are something that were long thought to be something out of scope for this style of platform,” he said. “It’s an important class of workload that is currently well served, but not by the Hadoop platform.”

That will change, he predicted. In particular, Cutting cited the work that Google is doing in this regard. Google published a paper a year ago that described an internal system it built on its platform “that’s very similar to Hadoop,” and that can run OLTP. The paper “demonstrates that it’s possible to bring OLTP to this style of platform,” he said.

“In the past, when we’ve seen that it’s possible, within a few years, it happens,” he said. “The prediction we can make here is it’s inevitable that we’ll see just about every kind of workload move to this platform, even online transaction processing.”

To be sure, there are vendors looking to build transaction processing on the Hadoop backbone. Just this week, we covered Splice Machine’s plan to bring standard, SQL-compliant transactional capabilities to the NoSQL HBase database that resides atop Hadoop, but there are others.

Cutting cuts an unlikely figure for an IT superhero, but he wears his fame well. In a parallel universe, Hadoop’s rise to prominence may have never come to pass. It’s all very fatalistic, and, in a way, out of Cutting’s hands. “In the early days, I expected there to be multiple systems like Hadoop, competing to potentially become a platform,” he said. “And really nothing else has emerged. Hadoop has come to dominate the big data space, and it’s becoming really the kernel of the de facto standard operating system for big data.”

It may be a stretch to say that Hadoop single-handedly started the big data revolution. After all, organizations have been pushing the limits of their data storage and data utilization capabilities for decades. But the idea that, with Hadoop, you never have to throw data away, ever, has had a fundamental impact on how we think about data, and on how we can use data.

“We’re in the middle of a revolution in data processing,” Cutting concluded. “Revolutions are scary times. Folks aren’t sure what’s going to come next. They’re not sure what allegiances to make, what path there is to follow. Hadoop I think provides a clear path that will endure into the future supporting wide varieties of workload and I think you can be comfortable adopting Hadoop for your data needs.”

At least until the next big thing comes along.

Applications: Enterprise Analytics

Technologies: Middleware

Sectors: Other

Vendors: Cloudera

Tags: Hadoop, HBase, Hive, oltp, pig
