Microsoft Benchmarks Distributed PostgreSQL DBs
Which distributed PostgreSQL database is tops when it comes to transaction processing throughput? It’s a good question, and Microsoft attempted to find answers when it commissioned GigaOM to benchmark its Azure Cosmos DB for PostgreSQL offering against contenders from Cockroach and Yugabyte.
PostgreSQL is far from new, but its popularity has skyrocketed in recent years as developers and architects have rediscovered the benefits of the open source relational database. Many of the new PostgreSQL workloads have landed on the cloud, where AWS, Google Cloud, and Microsoft Azure have created their own PostgreSQL cloud database services.
Plain vanilla PostgreSQL scales vertically on a single computer footprint, but engineering groups have sought to develop horizontally scalable versions of the database that can run in a distributed fashion. CitusData, Cockroach Labs, and Yugabyte each have developed distributed databases that are wire-compliant with PostgreSQL. The cloud giants have also followed suit, with Google delivering a PostgreSQL interface for its Spanner database service. AWS has also been hinting at a globally scalable version of Aurora, its PostgreSQL-compatible database, although nothing has come to market yet.
Microsoft Azure’s entry into this horserace is Azure CosmosDB for PostgreSQL, which uses Citus under the covers to achieve horizontal scalability.
In order to drum up support for its product, Microsoft recently commissioned GigaOM to benchmark its Citus-powered distributed PostgreSQL database against two comparable managed service offerings: CockroachDB Dedicated and Yugabyte Managed. The plan originally was to including the PostgreSQL interface for Spanner in the test, but the offering “didn’t provide the Postgres compatibility required to run the benchmark,” GigaOM said in its April 18, 2023 report.
The benchmark tests, which were based on GigaOM’s derivation of the industry standard TPC-C benchmark, sought to gauge how the three relational databases performed under load. GigaOM wanted to use the HammerDB tool to create the workload for all three databases. However, CockroachDB wasn’t compatible, so it uses datasets used by Cockroach for its TPC-C testing instead.
The benchmark simulated the application workload for a real-world company that moves consumer product goods and operates physical warehouses (as opposed to data warehouses–this is OLTP country, not OLAP). At the 1,000 warehouse level, the databases are asked to handle SQL queries regarding 30 million customers, 100 million items, 30 million orders, and 300 million order line items. Tests were also performed at the 10,000 and 20,000 warehouse levels.
GigaOM says it did the best it could to size the cloud environments for these tests. The Cosmos DB for PostgreSQL ran in Microsoft Azure (obviously) while CockroachDB Dedicated and YugabyteDB Managed ran in AWS. Both CockroachDB and YugabyteDB were given 14 worker nodes, each with 16 virtual CPUs, 64 GB of RAM, and 2,048 GB of storage (solid state, presumably). No information was provided for the coordinator node for these databases.
Cosmos DB for PostgreSQL was given 12 worker nodes, each with 16 vCores, 128 GB of RAM (twice the amount of RAM as its competitors), and 2,048 GB of storage. The coordinator node was a single 32 vCore instance with 128 GB of RAM and 512 GB of storage. GigaOM tweaked the default Cosmos DB for PostgreSQL setting for worker memory to 16MB and set “pg_stat_statements.track” to “none,” it says in its report. “These settings are not configurable for the fully-managed versions of YugabyteDB and CockroachDB,” it says.
The benchmark results report shows Azure CosmosDB for PostgreSQL winning all of the categories that are mentioned in the report. (If you’re new to database benchmarks, that might surprise you.)
For example, in the “best new orders per minute” category, Azure CosmosDB for PostgreSQL trounced its competitors, with a 1.05 million NOPM rating compared to 178,000 for CockroachDB and 136,000 for YugabyteDB. (NOPM is considered the equivalent of transactions per minute,” a standard TPC-C metric.) These best NOPM figures were generated at the 20,000 warehouse level. However, Azure CosmosDB for PostgreSQL’s best NOPM figure was from the 1,000 warehouse test (GigaOM ran the 10,000 and 20,000 warehouse tests after discovering the server utilization were only around 20% for the 1,000 warehouse test.)
“Azure Cosmos DB for PostgreSQL achieved over five times more throughput than the CockroachDB Dedicated and YugabyteDB Managed configurations…” GigaOM says in its report. “On this day, for this particular workload, with these specific configurations, Azure Cosmos DB for PostgreSQL had higher throughput than CockroachDB and YugabyteDB.”
In terms of the total cost of the configuration, Azure CosmosDB for PostgreSQL (not surprisingly) comes out the winner, with a $34.91 per hour cost to run the infrastructure on Azure versus $62.17 per hour to run the CockroachDB setup on AWS and $57.63 per hour to run the YugabyteDB setup on AWS. In terms of monthly costs, the Microsoft option was considerably less than its two competitors, the report shows.
Marco Slot, a principal software engineer at Microsoft, provided some caveats and color to the GigaOM benchmark in a June 21 blog post.
“Benchmarking databases, especially at large scale, is challenging–and comparative benchmarks are even harder,” he wrote.
Slot says one of the reason why Azure Cosmos DB for PostgreSQL is so fast is due to a concept in Citus called “co-location.”
“To distribute tables, Citus requires users to specify a distribution column (also known as the shard key), and multiple tables can be distributed along a common column,” Slot writes. “That way, joins, foreign keys, and other relational operations on that column can be fully pushed down.”
Also benefiting Team Microsoft is the capability in Citus to “scope” transactions and stored procedures to one specific distribution column value, which allows them to be “fully delegated to one of the nodes of the cluster,” thereby boosting scalability, Slot says.
In the end, it’s about tradeoffs, Slot says.
“The decision to extend Postgres (as Citus did), fork Postgres (as Yugabyte did), or reimplement Postgres (as CockroachDB did) is also a trade-off with major implications on the end user experience, some good, some bad,” he says. “CockroachDB and Yugabyte make different trade-offs and do not require a distribution column. Engineers like talking about the CAP theorem, though in reality there are many thousands of tricky trade-offs between response time, concurrency, fault-tolerance, functionality, consistency, durability, and other aspects.”
But every application is different, of course, and each user should decide for themselves which tradeoffs they are willing to make.