Cassandra Gets Monitoring, Performance Upgrades
The latest beta release of Apache Cassandra is designed to hit the ground running as the NoSQL database moves steadily to the cloud to provide managed services in production deployments.
Cassandra 4.0, released on Monday (July 20), is the first major update of the database since 2017. It incorporates more than 1,000 bug fixes and extensive “battle” testing to improve performance in production, making it “the most stable release ever,” maintainers asserted. Performance testing included running Cassandra on clusters as large as 1,000 nodes using an array of enterprise use cases.
Cassandra promoters note that hyper-scalers such as Apple (NASDAQ: AAPL) have deployed the database in production with more than 75,000 nodes, illustrating its ability to scale.
Among the new features incorporated into version 4.0 is the ability to stream data between nodes during scaling operations such as adding a new node or datacenter during peak traffic times.
It also includes new data access controls operating on a “per datacenter basis.” In one scenario, operators of datacenters located in Europe and the United States could configure Cassandra to restrict a user's access to a single datacenter using a “network authorizer” feature. Data governance features are gaining traction as European authorities crack down on the cross-border movement of personal user data.
Monitoring tools are also emphasized in the latest Cassandra release. Previously, open source tools from key code contributors such as DataStax and Instaclustr were the primary tools for observing Cassandra clusters.
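As a rough illustration of the per-datacenter scenario, the sketch below builds the kind of CQL statement an operator might issue to limit a role to one datacenter. The `ACCESS TO DATACENTERS` clause is part of Cassandra 4.0's role syntax; the helper names, the role `eu_analyst`, the password, and the datacenter name `eu_west` are hypothetical. It also assumes that authentication and a network authorizer (for example `network_authorizer: CassandraNetworkAuthorizer` in `cassandra.yaml`) have been enabled on the cluster.

```python
def quote(value):
    """Return value as a CQL string literal, doubling any single quotes."""
    return "'" + value.replace("'", "''") + "'"


def create_role_cql(role, password, datacenters):
    """Build a CREATE ROLE statement restricted to the given datacenters.

    The role and datacenter names are illustrative; real names come from
    the cluster's own topology.
    """
    if not datacenters:
        raise ValueError("at least one datacenter is required")
    dc_list = ", ".join(quote(dc) for dc in sorted(datacenters))
    return (
        "CREATE ROLE " + role
        + " WITH PASSWORD = " + quote(password)
        + " AND LOGIN = true"
        + " AND ACCESS TO DATACENTERS {" + dc_list + "};"
    )


# Hypothetical example: a role that may only connect to a European datacenter.
print(create_role_cql("eu_analyst", "change-me", ["eu_west"]))
```

The generated statement would be run through `cqlsh` or a driver session by a user with permission to create roles.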
“Constant monitoring of key performance indicators such as latency, disk usage, and throughput is critical to maintaining an optimal deployment,” Justin Cameron, a senior software engineer at Instaclustr, wrote last year in Datanami.
Around-the-clock “monitoring is necessary because both internal and external changes to Cassandra usage patterns are very common,” Cameron added.
The latest version allows users to selectively monitor system metrics and configuration settings via a feature called Virtual Tables. Other tools allow users to record and replay production workloads to analyze performance.
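Virtual tables are read with ordinary CQL queries. The following is a minimal sketch, assuming the `system_views.settings` virtual table that ships with Cassandra 4.0 and a session object whose `execute` method returns rows with `name` and `value` attributes, as the DataStax Python driver does by default. The function name `read_settings` and the prefix filter are hypothetical conveniences, not part of Cassandra.

```python
# CQL against the settings virtual table (assumed columns: name, value).
SETTINGS_QUERY = "SELECT name, value FROM system_views.settings"


def read_settings(session, prefix=""):
    """Return configuration settings whose names start with prefix.

    session is any object with an execute(query) method returning rows
    that expose .name and .value attributes.
    """
    rows = session.execute(SETTINGS_QUERY)
    return {row.name: row.value for row in rows if row.name.startswith(prefix)}


# Possible usage with the DataStax Python driver (requires a running node):
#   from cassandra.cluster import Cluster
#   session = Cluster(["127.0.0.1"]).connect()
#   print(read_settings(session, prefix="concurrent_"))
```

Virtual tables are not replicated, so a query like this one reports on the node that handles the request; checking a whole cluster means querying each node in turn.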
Along with DataStax, the data platform developer behind Cassandra, key code contributors to the 4.0 version include Amazon Web Services (NASDAQ: AMZN) and Instaclustr.
The Cassandra 4.0 beta release is here.