PayPal Open Sources Key-Value Store, JunoDB
PayPal last month released the source code for JunoDB, a distributed key-value store that it developed internally and that today powers a variety of backend services at the payment site, handling 350 billion transaction requests per day, the company says.
JunoDB was originally developed over a decade ago in C++ to address the specific needs of the company, according to a May 17 blog post by Yaping Shi, principal MTS, architect at PayPal. The company was moving to a microservices architecture that would require supporting a large number of persistent inbound connections to data stores, but the company’s IT architects couldn’t find a suitable product to support that approach.
“Since no commercial or open-source solutions were available to handle the required scale out-of-the-box, we developed our own solution to adopt a horizontal scaling strategy for key-value stores,” Shi writes.
The new database would address two primary scaling needs in distributed key-value stores, according to Shi: handling a large number of client connections, and handling growth in read and write throughput.
PayPal database developers created JunoDB with a proxy-based architecture to enable horizontal scaling. The JunoDB client library, which resides in the application, was developed to enable simple data actions through the JunoDB proxy, which manages requests from the clients, coordinates with the JunoDB storage servers that hold the data, and provides load balancing. Data in transit is protected with TLS, and stored data is also encrypted, either at the client or at the proxy layer.
JunoDB utilizes consistent hashing to partition data and minimize data movement. To support horizontal scale, it shards data among a number of database partitions located on server nodes. It also uses shards within shards, or “micro shards,” which serve as building blocks for data redistribution, Shi writes.
“Our efficient data redistribution process enables quick incremental scaling of a JunoDB cluster to accommodate traffic growth,” Shi writes. “Currently, a large JunoDB cluster could comprise over 200 storage nodes, processing over 100 billion requests daily.”
JunoDB has since been rewritten in Golang to provide multi-threading and multi-core capabilities. With JunoDB’s data replication methods, including within-data center and cross-data center replication, the key-value store delivers six 9’s of system availability for PayPal.
JunoDB has become a critical part of PayPal’s infrastructure, and powers almost all of the company’s applications today. That includes use as a temporary cache for data, to reduce loads on relational databases, as a “latency bridge” for Oracle applications, and to provide “idempotency,” or a reduction in duplicate processing.
“While other NoSQL solutions may perform well in certain use-cases, JunoDB is unmatched when it comes to meeting PayPal’s extreme scale, security, and availability needs,” Shi writes.
The database is named after Juno, the queen of heaven in Roman mythology.
PayPal has released JunoDB under a permissive Apache 2.0 license. You can download JunoDB from GitHub at github.com/paypal/junodb.