Has FaunaDB Cracked the Code for Global Transactionality?
An organization that wants to power a transactional application using a single database that spans multiple data centers around the world without giving up ACIDity has few good options. One solution is Google Spanner, but it requires advanced hardware and is only available in Google data centers. Now a NoSQL database called FaunaDB is emerging that its creators claim has solved the global transactionality challenge using just math and software.
The origin story of FaunaDB starts at Twitter, where Evan Weaver and his colleagues struggled to build a distributed backend upon Apache Cassandra that could serve core critical business objects, such as tweets, timelines, and user profiles, in a timely and globally consistent manner, Weaver explains to Datanami.
The social media company could guarantee the accuracy of those items by running them in a single database housed in its Sacramento, California data center. But that resulted in poor experiences for some users in far-flung corners of the globe, and also hurt operational efficiency, says Weaver, who was employee number 15 at Twitter. Weaver and his infrastructure team eventually cobbled together one-off fixes to keep Twitter growing, but it wasn’t an approach that could be applied anywhere else.
“When we left Twitter…we decided that, essentially, since the problem hadn’t been solved in the meantime, that if we didn’t solve it, it would never get solved,” Weaver says. “We took a ground-up, blank-slate approach to what a modern cloud database really needed.”
Global Transaction Challenge
Weaver says there have been three attempts to solve the challenge of designing a transactional database that spans multiple data centers around the world without giving up the precepts of ACID: atomicity, consistency, isolation, and durability. Google offered two of them, with its Spanner and Percolator offerings.
Percolator tackles the problem by designating a single physical data center as the “write master” for all transactions during any given period of time. “It’s not appreciably different than your classic RDBMS scale-out approach,” Weaver says. “It’s more scalable within a single data center, but not beyond.”
Spanner is a more advanced approach that delivers true multi-data-center, externally consistent transactions. Google achieves this by keeping the clocks of the database replicas running around the globe tightly synchronized to ensure that the data is correct. When one replica records a write, it places a temporary lock on the affected data to guarantee that no other replica can attempt a conflicting operation within the time window allowed for that transaction.
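The time-window idea above can be sketched as a “commit wait”: after choosing a commit timestamp, the coordinator holds its locks until every replica’s clock is guaranteed to be past that timestamp. This is a simplified illustration, not Spanner’s actual implementation; the uncertainty bound and class names are assumptions for the sketch.

```python
import time

# Assumed clock-uncertainty bound, in seconds. Spanner's specialized
# hardware keeps this to a few milliseconds; a public cloud without
# such hardware might see hundreds of milliseconds.
CLOCK_UNCERTAINTY_S = 0.007

class CommitWaitTxn:
    """Hypothetical transaction illustrating the commit-wait idea."""

    def __init__(self):
        self.locks_held = True  # locks taken when the write begins

    def commit(self):
        # Choose a commit timestamp no earlier than "now + uncertainty".
        commit_ts = time.monotonic() + CLOCK_UNCERTAINTY_S
        # Commit wait: keep holding locks until the local clock is
        # certainly past commit_ts, so no replica whose clock is within
        # the uncertainty bound can observe the write "early".
        while time.monotonic() < commit_ts:
            time.sleep(0.001)
        self.locks_held = False  # safe to release locks now
        return commit_ts

txn = CommitWaitTxn()
ts = txn.commit()
```

The cost is visible in the loop: every transaction pays the uncertainty bound as latency, which is why a 500-millisecond bound (as quoted below for public clouds) would be crippling.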
While Spanner works, it has its own set of drawbacks. Getting those tight temporal guarantees across clusters separated by thousands of miles requires advanced hardware, such as atomic clocks and GPS receivers, and Spanner is only available on Google’s proprietary hardware and software stack, and only on its network.
“That model falls apart when you’re trying to operate in the public cloud, or in an environment where you don’t have tight control over software and hardware infrastructure, because you have no idea really what the maximum possible latency of some transaction could be,” Weaver says. “Public cloud clock synchronization is typically no better than 500 milliseconds of tolerance, which is a lot when you’re talking about transactions that you commit within a few milliseconds.”
Weaver caught wind of an obscure 2012 paper co-authored by Yale computer science professor Daniel Abadi titled “Calvin: Fast Distributed Transactions for Partitioned Database Systems.” The paper came out about the same time as the Google Spanner paper, but wasn’t given much attention at the time.
Abadi’s Calvin algorithm is the third approach to solving the global transactional challenge, but it takes a much different approach to building a highly scalable, low-latency, distributed database than Spanner does. Whereas Spanner achieves transactional synchronicity through hardware, Calvin achieves it through math.
“It’s a logical solution that has no hardware dependency at all,” Weaver says. “It’s purely using the interaction of the database software layer to manage the consistency of the transaction.”
The big innovation in Calvin is the use of a pre-processor, Weaver says. “It decides up front which transactions are going to happen in which order, and then it replicates that out to the other sites,” he says. “Essentially it does away with locks, which is what Spanner and Percolator use, and replaces it with pre-processing.”
While the Spanner and Percolator models require two global round-trips to other data centers to ensure correctness of the data, the Calvin algorithm only requires one global round-trip because it’s set the order of the transactions ahead of time, Weaver says.
“All it has to do is ship the transaction with the known order to the other replica sites, to make sure, for all intents and purposes, that a quorum or a majority of them have acknowledged it,” Weaver says. “We do, unfortunately, have to actually send information to the other data centers. The speed of light is not something we’ve been able to work around.”
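The pre-processing scheme described above can be sketched in a few lines: a sequencer fixes a global order up front, ships the ordered batch to every replica in a single round trip, and commits once a majority acknowledges. Because every replica applies the same batch in the same order, all of them deterministically reach the same state without locks. The names here are illustrative assumptions, not FaunaDB’s actual API.

```python
def sequence(transactions):
    """Pre-processor: decide up front which transactions happen in which order."""
    return list(enumerate(transactions))

def replicate(ordered_batch, replicas):
    """The single global round trip: ship the ordered batch, count acks."""
    acks = sum(1 for r in replicas if r.receive(ordered_batch))
    return acks > len(replicas) // 2  # quorum of replicas acknowledged

class Replica:
    """A replica site that logs the agreed order, then executes it."""

    def __init__(self):
        self.state = {}
        self.log = []

    def receive(self, batch):
        self.log.extend(batch)
        return True  # acknowledge receipt of the ordered batch

    def apply(self):
        # Deterministic execution in the pre-agreed order: no locks,
        # because conflicts were already resolved by the ordering.
        for _, txn in self.log:
            txn(self.state)

replicas = [Replica() for _ in range(3)]
batch = sequence([
    lambda s: s.update(balance=100),        # deposit 100
    lambda s: s.update(balance=s["balance"] - 30),  # withdraw 30
])
committed = replicate(batch, replicas)
for r in replicas:
    r.apply()
```

Every replica ends with the same balance of 70, despite never coordinating during execution; the one round trip happens in `replicate`, before any transaction runs.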
Practical DB Applications
Professor Abadi, who now teaches at the University of Maryland, is an advisor to Fauna, the company that Weaver co-founded with CTO Matt Freels, a former Twitter engineer. Abadi worked with the San Francisco company to implement the Calvin algorithm in a commercial database, dubbed FaunaDB.
Weaver describes FaunaDB itself as a “document-relational” database that borrows features and concepts from both relational SQL databases and their non-relational NoSQL cousins. In addition to exposing data in relational and document forms, it can also be used as a graph database.
On the SQL front, Weaver says FaunaDB offers all the features of a standard RDBMS, including support for relational algebra, transactions, joins, foreign keys, constraints, indexes, views, and stored procedures. “Basically everything that made those systems so durable and reliable and resilient, but it’s all brought forward into a cloud-native, developer-friendly, high-productivity, globally scalable context,” he says.
But FaunaDB also borrows ideas from the NoSQL databases, including storing data as JSON documents, as Couchbase and MongoDB do. “FaunaDB has everything good from NoSQL. Arguably the only things good from NoSQL are the developer productivity of the new interfaces and elastic scale-out,” he says, “and then restoring everything that was lost from the RDBMS.”
FaunaDB has been generally available for about two years, in both a managed cloud and an on-premises model. One of the companies using FaunaDB is Nvidia, where the database is powering a transactional system that manages the identities of users, including log-in data and data used for consumer-facing services.
“With FaunaDB, we’re able to support tens of millions of users with a small operational staff, and its advanced features like global replication let us maintain high availability and correctness even in the case of unexpected regional outages,” says Bill Wagner, director of cloud services for Nvidia, on the Fauna website.
Another early adopter is Capital One, a bank that’s known for adopting cutting-edge technology. “…[W]hat drew us to FaunaDB Enterprise running in AWS was the potential of mainframe-like reliability and real, multi-region transactions in the cloud,” says Mike Fulkerson, vice president of core innovation at Capital One, on the Fauna website.
Proof in the Pudding
Weaver is aware that not everybody will believe what he’s claiming his team has achieved at Fauna, which has raised $32.6 million in a Series A funding round, with investments from Point72 Ventures, GV (formerly Google Ventures), CRV, Capital One Growth Ventures, Data Collective, and Quest Venture Partners.
To dispel the doubts and doubters, the company is planning to publish a series of tests that show the product can in fact do what Weaver claims it can. “That proof is coming in under a year. We’re very excited about it,” he says. “At this point it’s really about putting a bow on the consensus model and the transaction model, and proving to everyone that it is truly the best.”
While the Fauna team has been working on this system since 2012, Weaver says he’s been pleasantly surprised to see the IT industry move right into FaunaDB’s sweet spot. The growing need for a hybrid delivery model that merges on-premises and multi-cloud systems will test the public cloud giants, and stress the databases that Amazon Web Services and Microsoft Azure can offer (while Google Cloud Platform offers Spanner).
“Everyone is moving to multi-data-center, active-active, and even if it’s not now, it’s aspirational, and it’s where the industry is going,” Weaver says. “Your users live all over the globe. Why are they essentially using a single physical data center for everything? It’s a poor user experience. It’s a poor operations experience.”
Weaver sees a large potential user base for FaunaDB among companies that don’t want to have to cobble together something like he did at Twitter.
“The broader set of Fortune 1000 companies have observed this transition and they don’t want to go through that painful mid-stage,” he says. “They want to go straight from the classic mainframe-oriented architecture, where your database runs on a single machine, to something that’s truly global, rather than taking these detours through what’s increasingly an architectural data center failure model.”
Weaver thinks it’s unlikely that somebody will come out of the blue with a better solution to the problem than Calvin or Spanner. It’s possible, he says, but unlikely. “At the end of the day, the real tradeoff was that the paper was harder to understand [than the Spanner paper],” he says. “It didn’t have Google’s reputation validating what you’re trying to do, as opposed to anything intrinsic in the architecture itself.”