October 17, 2018

Inside Teradata’s Audacious Plan to Consolidate Analytics

Alex Woodie


Teradata, which told us last week to “stop buying analytics,” used its annual user conference this week to elaborate on that curious statement and explain its radical plan to dramatically simplify its customers’ analytics investments through massive consolidation of its competitive offerings under its new Vantage data platform.

To hear Teradata COO Oliver Ratzesberger explain it, top executives at Fortune 500 firms — and the boards that hold their purse strings — are simply fed up with big analytic investments that haven’t panned out, and they’re turning to Teradata for answers.

“The last five to 10 years have been a curse and a blessing,” Ratzesberger tells Datanami in an interview here at Teradata Analytics Universe in Las Vegas, Nevada, where approximately 3,000 Teradata customers, partners, and employees gathered for four-and-a-half days of training, education, and commiserating about failed analytic projects.

“There are few executives left who don’t say ‘I’ve spent billions of dollars. I have 1,500 clusters. I have Vertica there, Hana there, Greenplum there. We bought a couple instances of Netezza. But IBM just de-released Netezza, Vertica just got sold a second time, Greenplum is now this open source thing. And Hadoop – well, that is going away.’

“They’re literally coming to us and saying ‘Give us a proposal to clean up the dozens of instances and consolidate them into one,'” Ratzesberger continues. “And what they quickly figure out is, if you can run it with a handful of systems, TCO [total cost of ownership] is orders of magnitude different, because the TCO in most organizations include 2,000 headcount to run all of these technologies, and 2,000 headcount is a lot of money.”

De-Risking Analytics

Ratzesberger says he recently spoke with the head of risk at one of the largest banks in the world, who runs 800 separate systems designed to measure risk. All told, the various Python, R, Spark, and Hadoop systems cost the company $2 billion per year.

“He said the board demanded that we decrease that run rate considerably,” the former eBay analytics director says. “None of these other solutions are scalable to that regard. Of course Hadoop clusters are scalable, as long as you only want 15 users or 15 applications, because it doesn’t deal with high concurrency. The solution has always been ‘Get another Hadoop cluster and another Hadoop cluster,’ but that got everyone into exactly that mess.”

The way out of that “mess,” according to Teradata, is to simplify analytic product stacks by consolidating them on its newly announced Vantage platform. The product strategy, which is actually a continuation of the Teradata Everywhere strategy that the company launched two years ago, provides a centralized bundle of tools, technology, and applications that deliver a variety of analytic capabilities, with the Teradata relational database at its core, as well as an object storage system connected through a high-speed bus (more on that later).

The Vantage strategy, as it currently sits, places a variety of analytics engines on the core relational database, including its well-respected MPP engine, the Aster graph database, and a machine learning engine. The plan calls for Spark and TensorFlow engines to be added over time.

On top of these engines, Teradata will build support for a variety of languages, including SQL, R, and Python, with Scala, Go, and JavaScript in the works. It will also support analytic tools, including its own offerings as well as Jupyter, R Studio, and SAS, with support for Dataiku and KNIME coming in the future. The company is also making accommodations for specialized data types, such as time-series, temporal, and geospatial data.

Teradata Vantage architecture

Vantage’s object store connections will provide support for storing semi-structured and unstructured data, via hooks for Amazon’s S3 and Microsoft’s Azure BLOB store. The company also spoke of plans to support an on-premise object system at some point, but the details were hazy. And despite all the Hadoop-bashing at the event, Vantage will work with data stored in HDFS.

The grand plan calls for placing Teradata Vantage at the center of customers’ analytic strategies and accommodating the majority of customers’ analytic needs with Teradata-supplied tools and technologies. Pre-built applications and customizable application templates will also play a sizable role in the Teradata strategy, while still leaving enough room for partners to plug in their offerings as needed.

The centralization strategy will reduce some choice for customers, but it will also cut down on complexity, which Ratzesberger argues is the bigger problem. “More and more executives tell us…’I need outcomes. I need to reduce my churn. I don’t care what technology it is. It needs to be stable. It needs to be predictable. I need SLAs. It can’t be down. And I need to be able to implement quickly.'”

No More ‘Shiny Objects’

Ratzesberger says the recent announcement that Cloudera and Hortonworks plan to merge is a sign that Hadoop’s time is up, and that business executives are tired of chasing the next new thing to come out of open source.

Try not to look at the shiny object (LAVRENTEVA/Shutterstock)

Executives and boards are tired of investing billions into open source technology only to watch as the investments fail to generate the expected business results, he says. The fact that Cloudera now identifies as a data warehousing company is evidence that Hadoop itself is no longer relevant to business executives, he says.

“There’s a little bit of a shift going on now from having everything IT-led and chasing the next shiny object and always downloading the next open source software because it’s going to be solving the problems that the last one didn’t,” he says. “One of the smartest choices that Spark founders made was they said ‘We can run in Hadoop, but you don’t need Hadoop to run Spark.’ And technologies like TensorFlow — you don’t need Hadoop to run TensorFlow.”

That’s not to say that Teradata hasn’t made mistakes. While Ratzesberger didn’t call Teradata’s acquisition of Aster a mistake, he did admit that the company erred in how it handled Aster’s Hadoop-based technology.

“This goes back to focus,” he says. “We’re stopping a lot of things that we have done in the past that … lost revenues. Aster is a great example. Total distraction. Great componentry there. We took that and stuck it into Vantage. But now there’s just one team working on one platform, versus I had a Teradata team and a completely separate Aster team with a separate platform.”

While Teradata seeks to expose a simpler collection of analytic solutions to customers, Ratzesberger acknowledges that the company will have to deal with complexity behind the scenes. However, Teradata is fortunate that it has already done much of the hard work of building a parallel processing engine, and it intends to exploit that work in Vantage.

No More Mr. Nice Guy

Teradata watched as Hadoop ate its lunch for years, and is now apparently ready to exact some measure of revenge. But it’s also clear that Hortonworks-Cloudera isn’t its main competition anymore. The cloud is.

Teradata COO Oliver Ratzesberger delivered a keynote at TAU18 Monday (Image courtesy Teradata)

Teradata targets the 500 biggest analytic opportunities in the world, roughly defined as the Fortune 500, with some exceptions. Its parallel database has been used by many of these enterprises over the 39-year history of the firm, and that experience gives it the reliability that cloud-based offerings like Google’s BigQuery can’t match, Ratzesberger says.

“Google is struggling as an enterprise player,” he says. “They’re doing some things right, don’t get me wrong. But for a company to say ‘I’m going to BigQuery,’ that’s a high-risk gamble that we’re taking. We don’t even know if BigQuery will be around a couple of years from now. All the search appliances we bought from Google? They’re gone. They said ‘We’re stopping this. This is not working for us.'”

The company intends to exploit the work it’s already done to build its MPP analytical engine, which sits at the center of its Vantage strategy. Assertions that the database is only good for SQL workloads on structured data won’t fly, as Ratzesberger recounts how the company ran real-time analytic and NoSQL workloads on production Teradata systems at eBay.

“Parallel processing…is the only way to crack these massive data volumes. Parallel processing it turns out is very hard to do,” he says. “Yes there are hard problems to solve. No we have not solved all of them. But we have solved a lot of hard problems that Snowflake hasn’t even started to scratch the surface of.”

Ratzesberger borrowed a phrase from Buzz Lightyear to demonstrate Teradata’s capability. “What we have is technology that scales to infinity and beyond, to the largest systems, data volumes, users, complexity. And rather than trying to … sell it all in pieces, we realized we have the killer weapon at our disposal and now let’s focus on the user experience. And Vantage is one piece towards that to make it simple for our customer to consume.”

However, Vantage is not the end goal of Teradata’s plan. The company recently embarked upon a 10-year plan to dramatically evolve its products. Ratzesberger would not disclose any details of the top-secret plan, saying those involved have been bound by strict non-disclosure agreements.

Needless to say, it’s evident that emerging technologies — like neural network architectures, deep learning, hybrid architectures, and containerization — will likely factor in. “Vantage is a higher-level platform than what a data warehouse was,” Ratzesberger says. “Vantage will be a big part [of the 10-year plan] but it will be a stepping stone.”
