Who’s Winning the Cloud Database War
Three-quarters of all databases will be deployed or migrated to the cloud within two years, Gartner said today in its much-anticipated report on cloud database management systems. The big cloud companies are winning their share of battles, but there’s plenty of market share available for smaller and nimbler database providers too.
Gartner turned heads in June 2019 when it declared that the cloud had become the default deployment mechanism for databases. “The message in our research is simple–on-premises is the new legacy,” wrote Gartner analysts Adam Ronthal, Merv Adrian, and Donald Feinberg. “Cloud is the future.”
Fast-forward 17 months, through the start of the COVID-19 pandemic and the ensuing lockdown on in-person work, and the shift to the cloud has accelerated.
Now those Gartner analysts (in addition to Rick Greenwald and Henry Cook) have teamed up for a reprise of that 2019 report. Last month, they published the inaugural “Magic Quadrant for Cloud Database Management Systems,” thus giving the category the weight and emphasis that they feel it’s due.
“The cloud DBMS market is not new,” the analysts wrote. “What is new is the growth in cloud revenue, the percentage of revenue in the cloud versus overall DBMS revenue.”
In 2018, Gartner estimated that cloud database services accounted for $10.4 billion of the $46.1 billion DBMS market, or about 23%. In its new Magic Quadrant report, the company says cloud databases brought in $17 billion out of the overall $55.4 billion market, or 31% of the total. The cloud was responsible for nearly all of the growth in the database market, the analysts found.
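The percentages above follow directly from the quoted revenue figures. As a rough check, the short Python sketch below recomputes them. The dictionary keys and the helper name `cloud_share` are illustrative choices, not Gartner's terminology, and the growth calculation assumes the two sets of figures can be compared directly as an earlier and a later snapshot.

```python
# Revenue figures in billions of US dollars, as quoted in the article.
# The labels "earlier" and "latest" are illustrative, not Gartner's.
figures = {
    "earlier": {"cloud": 10.4, "total": 46.1},
    "latest": {"cloud": 17.0, "total": 55.4},
}


def cloud_share(cloud: float, total: float) -> float:
    """Return cloud revenue as a percentage of total DBMS revenue."""
    return 100.0 * cloud / total


# Cloud share of the overall market for each snapshot.
shares = {label: cloud_share(**f) for label, f in figures.items()}

# Share of the net market increase that came from cloud revenue,
# assuming the two snapshots are directly comparable.
cloud_growth = figures["latest"]["cloud"] - figures["earlier"]["cloud"]
total_growth = figures["latest"]["total"] - figures["earlier"]["total"]
growth_share = 100.0 * cloud_growth / total_growth

for label, share in shares.items():
    print(f"{label}: cloud share = {share:.1f}%")
print(f"cloud share of net growth = {growth_share:.1f}%")
```

The two market shares round to the 23% and 31% cited. The growth figure is only a back-of-the-envelope estimate from these two data points, and Gartner's own attribution of growth may rest on a different basis.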
“We forecast that, by 2023, cloud DBMS revenue will account for 50% of the total DBMS market revenue,” Gartner says. (Maintenance fees, which are incredibly sticky, are likely the primary reason why on-premises databases will continue to bring in billions for companies like Oracle, IBM, Microsoft, and others into the future.)
What’s interesting about Gartner’s report is that it doesn’t seek to separate transactional and analytical workloads. Instead, it lumps all the workloads together. That’s not to say that all the cloud databases are “translytical” in nature or have adopted the hybrid transaction analytical processing (HTAP) schemes. But it does say something about the nature of new and emerging database workloads, and how they are blending traditional elements, like transaction processing and data warehousing, with emerging workloads like data science exploration, stream/event processing, and operational intelligence.
With that blending in mind, it’s not surprising that a divide is emerging over which approach cloud vendors take with their databases: either multimodel databases that can shape-shift to adapt to different workloads, or individual databases specialized for particular data models and use cases, such as relational, graph, time-series, and geospatial. Multimodel databases can help cut down on data integration issues, but may sacrifice performance relative to dedicated engines.
The other main axis separating cloud databases is how they run. Some offerings are confined to a single cloud, while others can run on multiple clouds or in a hybrid manner that blends cloud and on-prem. Only a handful of databases can run “intercloud,” i.e. as a single database with nodes running in multiple clouds at the same time (CockroachDB was mentioned by Gartner in this light, although it’s not included in the Magic Quadrant).
Cloud databases also differ in other ways, including their acceptance of open source, their governance models, and their pricing models. Gartner also evaluated the maturity of their data ecosystems, which has been a major focus area for Snowflake of late. With so much variability, the common architectural principle unifying cloud databases as a class, then, becomes the reliance upon cloud storage as the backend.
The leaders quadrant is bifurcated into two clusters. Amazon Web Services, Microsoft, Google, and Oracle formed one cluster furthest to the top and the right in Gartner’s leaders quadrant, followed by IBM, Teradata, SAP, and Alibaba Cloud a little lower down.
AWS is the dominant cloud provider in the space, according to Gartner, with the largest customer base and the highest revenue. It has a better track record than other hyperscalers, Gartner says. But its focus on “best-fit engineering,” which positions different databases for different uses, puts an integration burden on customers, Gartner says. It’s also more reluctant to embrace a multi-cloud world, the analysts say.
Microsoft Azure offers a broad range of database types and has a large and growing installed base, according to Gartner. “Unlike some of its CSP [cloud service provider] competitors, Microsoft has embraced a multi-model strategy for many of its data management offerings, which can simplify deployment,” it says. Its cloud data ecosystem, while not yet mature, is clearly defined. However, pricing is an issue, as some Gartner clients report that Azure is more expensive than their on-prem deployments.
Google has a better vision for multi-cloud, Gartner says, particularly with its BigQuery Omni offering, which supports the execution of SQL queries on AWS (support for Microsoft Azure is expected soon). It also has a more partner-friendly ecosystem that integrates commercial open source products, such as MongoDB and Elastic, into its offerings. It’s lagging in the cloud data ecosystem department, and lock-in remains a concern with its Cloud Spanner offering, Gartner says.
Oracle may not be a hyperscaler, but its cloud offerings are well-adopted by enterprises and cover all the bases, ranging from transactional workloads and data warehousing to machine learning and big data (it offers Cloudera’s software formerly known as Hadoop). Its use of Real Application Clusters (RAC) for eliminating downtime for patching and upgrades gives Oracle an advantage over other cloud databases, Gartner says. However, only Oracle products run in the Oracle Cloud (save for the product formerly known as Hadoop). Oracle is also perceived to be pricey and hard to work with, but the company is making progress there, Gartner says.
IBM can also tick all the boxes with its Db2 cloud offerings, which include everything from relational databases to document storage (Cloudant) and IBM Event Streams, which is built on Apache Kafka. IBM has gone heavy on partners, as PostgreSQL, MongoDB, Elasticsearch, Redis, RabbitMQ, and DataStax all have a seat in the IBM Cloud. IBM also has a compelling multi-cloud strategy with support for AWS, Azure, and Google Cloud. The primary concerns with Big Blue are not technical, but reside in sales, marketing, and culture, according to Gartner.
Like other cloud database leaders, SAP offers a wide array of database types to large enterprises around the world. HANA is the name of the game here, with combined analytical and transactional processing, which can be augmented with predictive capabilities, graph analytics, OLAP engines, data virtualization, Hadoop, and Spark. Many SAP customers are unaware of the depth and breadth of SAP’s offerings, however. There is also a perception of high costs, not to mention difficulty motivating a large number of on-prem Business Suite customers to migrate to HANA in the cloud.
Teradata is a perennial power in data warehousing, and the company has successfully shifted its top SQL analytic capabilities to the cloud via its Vantage offering, which can run on public clouds, in Teradata’s own cloud, or even on-prem. The company also offers an array of other big data capabilities, including graph analysis and machine learning, which can be deployed in a federated manner. Price, once an issue, has been addressed with pay-as-you-go pricing, Gartner says. The biggest question seems to be whether Teradata can maintain its premium rating against a range of “good enough” cloud data warehouses.
Alibaba Cloud offers a surprisingly wide array of database options under the ApsaraDB line, including multiple relational and NoSQL options, a time-series data store based on InfluxDB, a MapReduce compute engine, and a data warehouse based on Greenplum, among others. The cloud is widely adopted in Asia. But that’s also a weakness, as it has not gained wide adoption in North America.
InterSystems, which offers its multimodel IRIS database on AWS and Azure, was selected for the visionaries quadrant thanks to its long history as a database provider, mostly in healthcare. Customers are starting to add analytics workloads to their deployments, and the company is expanding into other markets.
Databricks is primarily known for its data engineering and machine learning cloud capabilities, but thanks to Delta Engine, which is based on a C++ re-write of its core Spark SQL engine to provide MPP analytics capabilities, it’s moving strongly into the data warehousing space. Currently Databricks runs only on AWS and Azure (where Microsoft resells it), but support for Google Cloud is expected soon.
MarkLogic, which has long provided its multimodel NoSQL database, is another player in the cloud data management space. Gartner likes its approach to integration, as well as support for both analytics and transactional use cases. MarkLogic, however, continues to suffer in the name-recognition department, according to Gartner.
Cloudera made the visionaries department, but only by a hair. Gartner likes Cloudera’s large and diverse customer base, its utilization of open source, and its approach to managing data across multi-cloud and hybrid setups. Concerns include the shift from older Hadoop products to the new Cloudera Data Platform, as well as resistance to the new pricing model that it brings.
Redis Labs received high marks for its database that exposes NoSQL, graph, and time-series personas, and works for transaction processing as well as analytics and AI. The ability to keep databases synchronized across multiple clouds was seen as a benefit. But the eventual consistency model, perception as a memory layer, and pricing are concerns.
Snowflake was one of the first data warehousing providers to separate compute from storage on the cloud, and so it has a certain first-mover advantage. Gartner also likes the “robust” data sharing capabilities that the company is building, as well as how easy it is to scale. Concerns include cost, a lack of in-database analytics, and complexity when integrating data in host clouds.
Tencent offers a wide array of hosted database options, including relational, NoSQL, time-series, graph, and stream processing. However, a lack of penetration outside of Asia pushes down its grade.
Huawei, similarly, gets high marks for technology, with its GaussDB offerings for various use cases. The company has a very strong reputation in telecommunications, thanks to its engineering of 5G. But again, a lack of penetration outside of China keeps it in the Niche Players category.
You can access a copy of Gartner’s report at this AWS webpage.