Cockroach Labs Ready for Primetime with Scale-Out Database
Until recently, companies that needed a distributed, ACID-compliant relational database to power global transactions didn’t have a lot of options. There was Google Spanner and not much else. But now a group of distributed systems experts from Google says their new database, CockroachDB from Cockroach Labs, is ready for business.
Today, organizations have many different options available to them for managing data. On top of scale-up and scale-out relational databases, you have a raft of NoSQL databases, a variety of distributed file systems, and let’s not forget the object stores. These are all available in the cloud, on premises, and sometimes in a combination of the two.
But sometimes, for some workloads, nothing but a good old ACID-compliant relational database will do, says Jim Walker, vice president of product marketing with Cockroach Labs.
“There’s a lot of workloads locked up in Oracle and SQL Server and Db2 and they aren’t moving because they’ve got to be right,” Walker said at the recent Strata Data conference. “Everybody is struggling with same thing. They want massive global scale, or scale within a single data center. But they want transactions. They want consistent transactions — not take NoSQL and put transactional layers on top of it — but a truly relational data store, like Spanner had done.”
The three founders of Cockroach Labs — Peter Mattis, Ben Darnell, and Spencer Kimball — watched as their Google colleagues developed Spanner into the world’s first globally distributed ACID-compliant database. Spanner achieved the difficult feat by relying on atomic clocks installed in Google data centers to achieve transactional consistency.
From their work developing Google Colossus, the second generation of the Google File System, Mattis, Darnell, and Kimball realized the significance of Spanner, and so they set out to build a similar system that could deliver the same transactional benefits, but without the reliance on atomic clocks installed in data centers.
CockroachDB achieves this feat using the Raft consensus algorithm and multi-version concurrency control (MVCC), which Walker calls the “rocket science” underlying the database. Also playing a role is Network Time Protocol (NTP), which the database uses to track how closely the geographically distributed nodes in the database are in sync. The developers built logic into the database to tolerate clock “skew” across the various nodes, which makes it resilient to a certain degree of difference.
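To make the MVCC idea concrete, here is a minimal sketch of a multi-version key-value store. It is not CockroachDB’s implementation: the class name `MVCCStore`, its methods, and the integer timestamps are all hypothetical, chosen only to show that each write keeps a timestamped version and that a read at a given timestamp sees the newest version written at or before it.

```python
import bisect


class MVCCStore:
    """Toy multi-version store (hypothetical, for illustration only)."""

    def __init__(self):
        # key -> (sorted list of timestamps, values in the same order)
        self._versions = {}

    def write(self, key, value, ts):
        """Record a new version of `key` at timestamp `ts` without
        overwriting older versions."""
        timestamps, values = self._versions.setdefault(key, ([], []))
        i = bisect.bisect_right(timestamps, ts)
        timestamps.insert(i, ts)
        values.insert(i, value)

    def read(self, key, ts):
        """Return the newest value written at or before `ts`,
        or None if the key had no version yet at that time."""
        timestamps, values = self._versions.get(key, ([], []))
        i = bisect.bisect_right(timestamps, ts)
        return values[i - 1] if i else None


store = MVCCStore()
store.write("balance", 100, ts=10)
store.write("balance", 80, ts=20)

print(store.read("balance", ts=15))  # sees the version written at ts=10
print(store.read("balance", ts=25))  # sees the version written at ts=20
print(store.read("balance", ts=5))   # no version existed yet
```

Because old versions are retained, a reader working at an earlier timestamp gets a consistent snapshot even while later writes arrive.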
“If every node is aligned, that’s great,” Walker told Datanami. “But if one node is 10 milliseconds off, which happens, or 40 milliseconds off, we can actually deal with that within our execution engine. If a node gets too far off, what we do is just kill it, because that’s okay. The database will rebuild itself and it will actually bring it back to where it needs to be.”
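The policy Walker describes, tolerating small clock differences and removing a node that drifts too far, can be sketched roughly as follows. This is an illustration under stated assumptions, not the database’s actual logic: the function `classify_node`, the use of a median over peer offsets, and the 500-millisecond threshold are all hypothetical choices made for the example.

```python
from statistics import median

# Assumed tolerance for illustration only; not taken from the article.
MAX_OFFSET_MS = 500.0


def classify_node(peer_offsets_ms, max_offset_ms=MAX_OFFSET_MS):
    """Decide whether a node's clock is close enough to its peers.

    `peer_offsets_ms` holds the measured clock offsets (in milliseconds)
    between this node and each peer. Returns "ok" when the typical offset
    is within the tolerance, and "evict" when the node has drifted too far.
    """
    if not peer_offsets_ms:
        return "ok"  # nothing to compare against
    typical = median([abs(offset) for offset in peer_offsets_ms])
    return "ok" if typical <= max_offset_ms else "evict"


print(classify_node([10, 12, 9]))       # small skew is tolerated
print(classify_node([40, -38, 45]))     # still within tolerance
print(classify_node([900, 950, 880]))   # too far off, node is removed
```

Using the median rather than the maximum means a single bad measurement against one peer does not by itself cause a node to be removed; that is a design choice of this sketch.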
CockroachDB is wire-protocol compatible with Postgres, which means there should not be too much work for a customer using Postgres to replace the database with CockroachDB. That’s not to say you can just drop CockroachDB in for Postgres and be done with it. Replacing a database is never easy, and should be done with the utmost care, particularly for high-value transactional workloads.
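In practice, wire compatibility means an application can usually keep its Postgres driver and change only where it connects. The sketch below builds a Postgres-style connection URL of the kind such drivers accept. The helper `build_dsn`, the host name, user, and database are hypothetical; port 26257 is CockroachDB’s default SQL port, and `sslmode` is a standard Postgres connection parameter. No connection is opened here.

```python
from urllib.parse import quote, urlencode, urlsplit


def build_dsn(user, host, database, port=26257, sslmode="verify-full"):
    """Build a PostgreSQL-style connection URL (hypothetical helper).

    The same URL format is used for Postgres itself; only the host and
    port point at a different server.
    """
    query = urlencode({"sslmode": sslmode})
    return f"postgresql://{quote(user)}@{host}:{port}/{quote(database)}?{query}"


# Placeholder values for illustration only.
dsn = build_dsn("app_user", "db.example.internal", "bank")
parts = urlsplit(dsn)
print(dsn)
print(parts.scheme, parts.hostname, parts.port, parts.path)
```

A string like this would then be handed to whatever Postgres client library the application already uses, which is where the careful testing of corner cases begins.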
“What will kill you is the corner cases,” Walker said. “You can’t have some corner case pop up. So it takes a while for databases to gestate. We’ve been building for four years. And we’ve just kind of started this whole go to market because now it’s at a point where the DB is extremely solid.”
While the underpinnings are solid, the database will continue maturing in the coming years, Walker said. The company continues to work through tough problems, including with etcd, the key-value store underlying Kubernetes. CockroachDB relies on Kubernetes, and its developers have fixed upstream problems in etcd, Walker said.
CockroachDB is beginning to be deployed in these types of production systems. Telecom giant Comcast has adopted it. So have Baidu, WeWork, and Kindred Group, the Stockholm-based European gambling giant. When Kindred expanded its operations to New Jersey recently, CockroachDB’s ease of use really shined, Walker said. “They’re able to scale the database to Jersey because they just spin up some new nodes and point it at the cluster and the data gets spread out,” he said.
CockroachDB isn’t the only distributed ACID-compliant relational database on the market these days. Amazon Web Services (AWS) has Aurora, and Fauna has FaunaDB too. Cockroach Labs welcomes the competition, and is confident that its database will be what the market demands.
“We believe the future is cloud native and we believe the future is ultimately global,” he said. “There’s a lot of really smart CTOs at these financial institutions who are thinking, I can’t be Ubered. There’s going to be some other banking thing that comes up. How do we mitigate that risk?
“How do you do distributed transactions at scale?” he continued. “What’s that cloud native database that’s going to open that up? That’s what we feel is right.”