July 28, 2021

Is Now the Time for Database Virtualization?


Over the past 20 years, almost all elements of the IT stack have been virtualized. We have virtual storage, virtual networks, and virtual servers. But one part of the stack is conspicuously absent from the virtualization story: the database. Is it time to virtualize the database? Some say the timing couldn’t be better.

Database virtualization is not a new concept, but it’s also not a widely implemented one. In one type of database virtualization, such as that practiced by San Francisco-based Datometry, a layer of abstraction is inserted between the application and the database. This emulation layer masks the differences between databases, enabling customers to move from one database to another much more easily than before.

According to Datometry CEO Mike Waas, database virtualization has the potential to give companies much greater freedom to use other databases.

“Getting off of a database is just everybody’s nightmare,” the database veteran told Datanami. “Everybody has been, for the last 50 years, kind of suffering from vendor lock-in on databases, but nobody has ever really done anything about it. That’s what we want to change.”

Datometry’s offering, called Hyper-Q, currently targets the Teradata analytical database, and support for Oracle‘s Exadata appliance is due later this quarter. According to Waas, who cut his teeth on databases with Microsoft in the 1990s before working at Amazon.com and Greenplum, companies often budget $20 million to $30 million over a period of three years to migrate off a midsize Teradata appliance. However, the actual projects often take upwards of $50 million, with just a 15% success rate.

“Which in the past was maybe a viable decision,” he said of the failure to fully decommission an OLAP system. “But as you move to the cloud, and if you really want to get rid of the hardware and the incumbent system, that’s no longer an option. So we allow them to really kiss that thing goodbye, move everything they’ve got and decommission the old box.”

A Database Emulation

Datometry’s offering essentially is an emulator for databases. The company is focused primarily on column-oriented relational databases, or OLAP systems, such as Teradata’s popular offering. There’s no reason its database emulator couldn’t be used for OLTP systems too, but data warehouse migrations tend to be the most costly and painful, so the company is starting there.

Here’s how Waas describes the product:

“We intercept the communication from the application, unpack the requests, take out the SQL, and then do what effectively is almost like what the upper half of any database does, meaning build an entire algebraic model for the incoming request, optimize that, and then synthesize what the optimized SQL means for that destination,” he said.
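That pipeline is, at heart, a dialect-to-dialect translation. As an illustration only, here is a minimal sketch using the open-source sqlglot library, which is not Datometry’s engine and handles only query text rather than the wire protocol or workload optimization Waas describes: it parses a statement under the source dialect’s grammar and regenerates it for the destination.

```python
# Minimal sketch of the parse/rewrite/re-emit idea using the open-source sqlglot
# library. This is not Datometry's engine, just an illustration of translating a
# statement from one SQL dialect into another.
import sqlglot

# A query as a BI tool pointed at Teradata might issue it.
source_sql = "SELECT TOP 10 region, SUM(revenue) AS total_rev FROM sales GROUP BY region"

# Parse under Teradata's grammar, then regenerate equivalent SQL for BigQuery.
for statement in sqlglot.transpile(source_sql, read="teradata", write="bigquery"):
    print(statement)  # the Teradata-style TOP clause should come back as a LIMIT
```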

The Datometry architecture (Source: Datometry white paper “Rapid Adoption of Cloud Data Warehouse Technology Using Datometry Hyper-Q”)

Once the Datometry software has discovered the defining characteristics of the source database, a replacement solution, consisting of the real-time workload translation layer, can be deployed in the field to support the customer’s new database system. Running in the customer’s virtual private cloud (VPC), the Datometry solution sits between the requesting system, such as a Tableau or Looker BI client, and the new data warehouse that the customer chose, which is likely Amazon Web Services‘ Amazon Redshift, Google Cloud BigQuery, or Microsoft Azure Synapse Analytics.

The key advantage that this approach has, Waas says, is that none of the analytics applications know that they’re not talking to a Teradata data warehouse anymore. As the Tableau or Looker BI client fires off SQL queries, or as the Informatica or Talend ETL tool loads source data into the warehouse, the Datometry emulator interprets the requests and tweaks them as necessary to account for the differences between the old Teradata and the new Redshift/Synapse/BigQuery system.
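As a rough illustration of what sitting in the data path means, the sketch below assumes a hypothetical DialectProxy class and a standard DB-API connection to the new warehouse; real protocol-level emulation of a Teradata endpoint is far more involved than rewriting query text.

```python
# Toy stand-in for a translation layer sitting between a BI client and the new
# warehouse. DialectProxy is a hypothetical name, not part of any product.
import sqlglot

class DialectProxy:
    """Accepts statements written for the legacy dialect and forwards translated SQL."""

    def __init__(self, backend_conn, read_dialect="teradata", write_dialect="redshift"):
        self.backend = backend_conn      # e.g. a DB-API connection to the new warehouse
        self.read = read_dialect
        self.write = write_dialect

    def execute(self, legacy_sql):
        # Translate the incoming statement, then run it against the destination system.
        translated = sqlglot.transpile(legacy_sql, read=self.read, write=self.write)[0]
        cur = self.backend.cursor()
        cur.execute(translated)
        return cur.fetchall()
```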

Waas says Datometry has done its homework in developing its solution to account for the specific design and peculiarities of Teradata and Oracle systems, which are extremely complex analytical machines with many moving parts. He said that out of the box, Hyper-Q can replicate 99.6% of the Teradata functions. The one caveat is that Datometry so far hasn’t developed support for XML. (That sound you hear is MarkLogic executives breathing a sigh of relief.)

“Teradata has wonderful things: macros, stored procedures. You name it, we do all of that,” Waas says. “Even if your new destination database doesn’t have stored procedures, we give you stored procedures because a stored procedure is effectively a set of, or string of, SQL statements connected with control flow. And so we actually interpret the control flow and execute the SQL statements, let’s say against BigQuery or Synapse.

“So you get full fidelity of the stored procedure, with all the goodness of error handling and go-to statements and you name it,” he continued. “But it’s not executing in the database. The control flow is executed in Hyper-Q, our product, but all the heavy lifting is done in the database.”
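A toy sketch of that division of labor, using an illustrative step format that is not Hyper-Q’s actual design: the control flow is interpreted outside the database, while every SQL statement is shipped to the destination warehouse.

```python
# Illustrative only: branching runs in the translation layer, each SQL statement
# runs in the destination warehouse via the caller-supplied execute_sql function.
def run_procedure(steps, execute_sql):
    """steps: list of ("sql", statement) or ("if", condition_expr, then_steps, else_steps)."""
    for step in steps:
        if step[0] == "sql":
            execute_sql(step[1])  # the heavy lifting stays in the destination database
        else:
            _, condition_expr, then_steps, else_steps = step
            # Evaluate the condition with a scalar query, but branch here in the emulator.
            rows = execute_sql(f"SELECT CASE WHEN {condition_expr} THEN 1 ELSE 0 END")
            run_procedure(then_steps if rows[0][0] else else_steps, execute_sql)
```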

A Grander Vision

Datometry prices its offering based on estimates of workloads. For a typical midsize Teradata system, the cost would be a few hundred thousand dollars per year, Waas says. The migrations are completed at one-tenth the cost and in one-tenth the time of a traditional migration handled by a systems integrator, with a 90% success rate, he claims.

After spending a career as a database customer and working for database vendors, Waas has seen the huge impacts that sticky databases can have.

When he started with Amazon.com in 2005, he was involved in the early stages of the online bookseller’s migration off the Oracle database, a 15-year project that finally culminated last year. Facebook similarly took a couple of years just upgrading from one version of Postgres to another.

Datometry hopes to help customers celebrate freedom from database lock-in (Pakthongchai/Shutterstock)

Customers aren’t the only ones hurt by database lock-in, Waas says. When database vendors improve their product, customers are often unable to upgrade to take advantage of the new features, which hurts the vendors’ long-term prospects.

“Had we done Datometry 10 years ago, you would have gone from Oracle with a fixed hardware footprint to a SQL Server with a fixed hardware footprint. Tomato, toh-mah-to,” he says. “The value would have been at best 10% to 15%. There wasn’t a quantum leap in value.”

But the cloud does provide that quantum leap in value, with its scalability, performance, and app marketplaces, Waas says. Finally, after years of treating the pain of database migrations with topical ointment, a cure may finally be at hand.

“It really crystallized as, this is finally the moment in history where there is an external force that shifts everybody from on-prem to the cloud,” he says. “This is an ideal opening, so to speak, for this technology to move from one database system to another, to bring virtualization to the table.”

Database migrations are the most obvious use for database virtualization. But the Datometry vision is bigger than that, Waas says.

“Think of VMware,” he says. “Twenty years ago, people looked at it as a consolidation tool for multi-core. Nobody looks at it that way anymore. There’s so much functionality that VMware has built.”

Datometry is thinking along the same lines. Once you’re in the data path, Waas says, there’s much more functionality that you can build, including management, orchestration, security, profiling, and optimization.

“Fast forward a couple of years, I believe nobody will connect an application directly to a database anymore, just like nobody puts enterprise software on a bare-metal server today,” he says. “The script has been completely flipped by virtualization in the last 15 years and I expect the same kind of thing to happen in the database space.”

Related Items:

Database Migrations Shift Into High Gear

Who’s Winning the Cloud Database War

Cloud Now Default Platform for Databases, Gartner Says

 
