Teradata Puts New Cloud Architecture to the 1,000-Node Test
Teradata says the recent 1,000-node test that it ran on AWS not only shows the scale that its new cloud architecture can achieve, but also demonstrates the analytic flexibility required by its new target market, the Global 10,000.
Teradata is a company on the move, and it’s moving both in terms of what it makes and whom it makes it for. It’s no longer developing tightly coupled, on-prem data warehouses for the largest corporations in the world. Instead, it’s creating de-coupled analytic software that runs anywhere–on-prem, in the cloud, or in multiple clouds–and its market has widened considerably, from the Fortune 500 to the Global 10,000.
This background is important for understanding why the Teradata Innovation Lab spent the time to take Teradata Vantage (the name of its cloud-based offering) for a 1,000-node spin. According to Teradata Chief Product Officer Hillary Ashton, the AWS cluster was more than twice as big as any production cluster run by a Teradata customer.
“We have some of the largest customers on-premises on the planet, so I think it’s a really great indication of where we’re heading in the future,” Ashton told Datanami. “Obviously it gives our large enterprise customers the comfort that our future is big enough for the largest enterprise workloads.”
The test, which took about four weeks to run and was announced one week ago, utilized 100 TB of data and simulated thousands of concurrent SQL queries submitted by more than 1,000 simulated users. The workload itself was a mix of quick-hitting operational queries that demanded fast response times, as well as more complex and longer-running decision support system (DSS) queries.
There are some details of the test missing, including the specific EC2 instance types that were used (Teradata says there were two types used), and the total cost of the system. The company says it will share more details in a white paper that will be published in a couple of weeks. But don’t expect pricing information, as this test was never intended to be a public price-performance benchmark.
“What we really wanted to do,” said Tim McIntire, senior vice president of software engineering for Teradata, “was show the stress of the system at scale, show its ability to scale, show how quickly we could do it, and show the flexibility of how you could build a system in cloud using our new architecture.”
The new architecture is Teradata Vantage, the company’s flagship cloud offering, which is built atop an S3-compatible object storage system. Instead of running the traditional column-oriented MPP database atop bare iron, as is the customary deployment scenario, Teradata Vantage essentially abstracts the core elements of the eponymous Teradata database and reconstructs them in a file system (the Teradata Database File System), which itself runs atop the object store (the Native Object Store). This architecture provides the separation of compute and storage that today’s big data customers demand in the cloud.
But not all of the 100 TB was stored in S3 via the TDFS or NOS. “The decision support queries are longer running queries [and] all ran directly from object store with a Teradata file system sitting in object store,” McIntire said. “Then the SLAs for those tactical workloads required putting data closer to compute. So those ran out of EBS.”
Amazon Elastic Block Store (EBS) is a block-level storage service that can be attached directly to Amazon EC2 instances. It’s commonly used by relational databases, such as Amazon Relational Database Service (RDS), that require the fastest response times.
While the data itself may reside in S3 or EBS, depending on the data type, the workload, and the service level agreement (SLA), the customer isn’t responsible for managing it, Teradata says. Instead, the Teradata Vantage software handles that.
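To make the idea concrete, here is a minimal sketch of SLA-driven storage placement in Python. It is not Teradata's implementation, and the article does not describe how Vantage makes this decision internally. The `Workload` class, the `choose_tier` function, and the one-second threshold are all hypothetical, chosen only to illustrate the split McIntire describes: tactical queries served from block storage, longer-running DSS queries served from the object store.

```python
from dataclasses import dataclass

# Hypothetical cutoff: workloads that must respond at least this fast
# are placed on block storage. The real criteria are not public.
TACTICAL_SLA_SECONDS = 1.0


@dataclass(frozen=True)
class Workload:
    name: str
    sla_seconds: float  # target response time for the workload


def choose_tier(workload: Workload) -> str:
    """Return 'block' (EBS-like) for tight SLAs, 'object' (S3-like) otherwise."""
    if workload.sla_seconds <= TACTICAL_SLA_SECONDS:
        return "block"
    return "object"


# Two illustrative workloads mirroring the mix used in the test.
workloads = [
    Workload("tactical_lookup", 0.5),  # quick operational query
    Workload("dss_report", 600.0),     # long-running decision support query
]
placement = {w.name: choose_tier(w) for w in workloads}
```

In this sketch, `placement` maps the tactical workload to block storage and the DSS workload to the object store. A real system would presumably weigh data type, cost, and access patterns as well, not a single threshold.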
“What we really wanted to show there is how we could spread workloads across the system in the cloud at scale without running into contention across the networking or the I/O side,” McIntire said.
Teradata Vantage includes all of the capabilities that Teradata customers are accustomed to finding in their Teradata deployments, according to Ashton. Some Teradata customers, including Volkswagen, are streaming Internet of Things (IoT) data into Teradata’s object storage. The ability to store different data types on storage services at different price points gives customers the flexibility they are demanding, she said.
“You can put your data on EBS managed directly by Teradata Vantage, or you can have it on S3 or a different native object store system. Many of our customers choose to use both, for a variety of use cases, as well as streaming,” Ashton said. “The ability to have the optimization workload management capabilities tied to native objects is absolutely outstanding and a huge differentiator for us in terms of driving that total cost of ownership down.”
Teradata is competing in a data warehouse market that has shifted remarkably quickly from on-prem appliances to distributed Hadoop environments to the cloud. The company that was once the king of the hill in enterprise data warehousing is finding itself in the middle of a dogfight, with big dogs and little dogs alike.
It’s going up against the data warehouse offerings of cloud giants like AWS, Microsoft Azure, and Google Cloud, as well as the offerings of popular upstarts like Snowflake and Databricks. It also must contend with storage-less query engine vendors like Dremio and Presto offerings, such as Starburst and Ahana, not to mention established incumbents like IBM and Oracle.
This is why Teradata executed the 1,000-node test: to show the world that it can still compete in a market that has rapidly changed.
“We wanted to demonstrate that we are bleeding edge in terms of the enterprise scale…in terms of mixed workload–the real work that actual large enterprises want to be able to do,” Ashton said. “They don’t want to have to duplicate data and separate it out and have DSS running in a different environment. That’s an important part of how we’re thinking about it, and we just want to let the market know that we’re here to play and we’re pretty ambitious about our future in the cloud.”
Ashton repeatedly highlighted Teradata’s query push-down capability, which enables it to push queries to other data stores, including data warehouses or even Hadoop clusters running on-prem or in the cloud. She said it holds significant appeal for customers frustrated with the constant movement of data.
“We have about 5X as many analytic functions in our cloud environment as the closest competitor,” she said. “We think that AI and ML obviously are the main factors for why customers are moving to the cloud. We have the best solution on the market to handle analytics…for our customers without having to replicate data or move it.”