July 27, 2021

Apache Cassandra Project Releases Apache Cassandra v4.0

WILMINGTON, Del., July 27, 2021 — The Apache Cassandra Project released today v4.0 of Apache Cassandra, the Open Source, highly performant, distributed Big Data database management platform.

“A long time coming, Cassandra 4.0 is the most thoroughly tested Cassandra yet,” said Nate McCall, Vice President of Apache Cassandra. “The latest version is faster, more scalable, and bolstered with enterprise security features, ready-for-production with unprecedented scale in the Cloud.”

As a NoSQL database, Apache Cassandra handles massive amounts of data across load-intensive applications with high availability and no single point of failure. Cassandra’s largest production deployments include Apple (more than 160,000 instances storing over 100 petabytes of data across 1,000+ clusters), Huawei (more than 30,000 instances across 300+ clusters), and Netflix (more than 10,000 instances storing 6 petabytes across 100+ clusters, with over 1 trillion requests per day), among many others. Cassandra originated at Facebook in 2008, entered the Apache Incubator in January 2009, and graduated as an Apache Top-Level Project in February 2010.

Apache Cassandra v4.0

Cassandra v4.0 effortlessly handles unstructured data, with thousands of writes per second. Three years in the making, v4.0 reflects more than 1,000 bug fixes, improvements, and new features that include:

  • Increased speed and scalability – streams data up to 5 times faster during scaling operations and delivers up to 25% faster throughput on reads and writes, making for a more elastic architecture, particularly in Cloud and Kubernetes deployments.
  • Improved consistency – optimized incremental repair keeps data replicas in sync for faster, more efficient operation.
  • Enhanced security and observability – audit logging tracks user access and activity with minimal impact on workload performance. New capture and replay enables analysis of production workloads to help ensure regulatory and security compliance with SOX, PCI, GDPR, or other requirements.
  • New configuration settings – exposed system metrics and configuration settings give operators the flexibility and easy access to data they need to optimize deployments.
  • Minimized latency – garbage collector pause times are reduced to a few milliseconds with no latency degradation as heap sizes increase.
  • Better compression – improved compression efficiency eases unnecessary strain on disk space and improves read performance.

Cassandra 4.0 is community-hardened and tested by Amazon, Apple, DataStax, Instaclustr, iland, Netflix, and others that routinely run clusters as large as 1,000 nodes, across hundreds of real-world use cases and schemas.

The Apache Cassandra community employed several testing and quality assurance (QA) projects and methodologies to deliver the most stable release yet. During the testing and QA period, the community generated reproducible workloads that are as close to real life as possible, while verifying the cluster state against the model without pausing the workload itself.

“In our experience, nothing beats Apache Cassandra for write scaling, and we’re looking forward to the performance and management improvements in the 4.0 release,” said Elliott Sims, senior systems administrator at Backblaze. “We rely on Cassandra to manage over one exabyte of customer data and serve over 50 billion files for our customers across 175 countries so optimizing Cassandra’s capabilities and performance means a lot to us.”

“Apache Cassandra’s contributors have worked hard to deliver Cassandra 4.0 as the project’s most stable release yet, ready for deployment to production-critical Cloud services,” said Scott Andreas, Apache Cassandra Contributor. “Cassandra 4.0 also brings new features, such as faster host replacements, active data integrity assertions, incremental repair, and better compression. The project’s investment in advanced validation tooling means that Cassandra users can expect a smooth upgrade. Once released, Cassandra 4.0 will also provide a stable foundation for development of future features and the database’s long-term evolution.”

Apache Cassandra is in use at Activision, Apple, Backblaze, BazaarVoice, Best Buy, Bloomberg Engineering, CERN, Constant Contact, Comcast, DoorDash, eBay, Fidelity, GitHub, Hulu, ING, Instagram, Intuit, Macy’s, Macquarie Bank, Microsoft, McDonalds, Netflix, New York Times, Monzo, Outbrain, Pearson Education, Sky, Spotify, Target, Uber, Walmart, Yelp, and thousands of other companies that have large, active data sets. In fact, Cassandra is used by 40% of the Fortune 100. Select Apache Cassandra case studies are available at https://cassandra.apache.org/case-studies/

In addition to Cassandra 4.0, the Project also announced a shift to a yearly release cycle, with releases to be supported for a three-year term.

Availability and Oversight

Apache Cassandra software is released under the Apache License v2.0 and is overseen by a volunteer, self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project’s day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Cassandra, visit https://cassandra.apache.org/

About Apache Cassandra

Apache Cassandra is an Open Source, distributed, wide column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple datacenters, with asynchronous masterless replication allowing low latency operations for all clients. Apache Cassandra is used in some of the largest data management deployments in the world, including nearly half of the Fortune 100.


Source: Apache Cassandra
