Microsoft Extends Cassandra Rings with NoSQL Database Preview
Microsoft today announced the public preview of a new Cassandra database service for Azure Cosmos DB. The service, called Azure Managed Instance for Apache Cassandra, extends previous work that Microsoft did with Cassandra, and will enable customers to build new Cassandra databases and to build hybrid setups that extend existing Cassandra database rings to Microsoft’s service running in the Azure cloud.
Azure Cosmos DB is a globally distributed multi-model NoSQL database that Microsoft launched in 2017 in its Azure cloud. Back then, the offering launched with five APIs, allowing customers to treat Cosmos DB data as SQL tables, as MongoDB JSON documents, as nodes in a graph (Apache TinkerPop Gremlin), as a key-value store (etcd), and as Cassandra tables.
With the new Azure Managed Instance for Apache Cassandra, Microsoft is giving customers more of the big data benefits of the Cassandra NoSQL database, particularly in scalability and data replication. Microsoft’s Cassandra support is no longer limited to answering CQLv4 queries, which is what the Azure Cosmos DB Cassandra API provided.
With the new offering, Microsoft says it’s delivering a fully native Cassandra database. Along with this comes the capability to mix and match Microsoft’s new hosted Cassandra database service with customers’ existing Cassandra databases in a hybrid cloud deployment.
“You can use this [Azure Managed Instance for Apache Cassandra] service to easily place managed instances of Apache Cassandra datacenters, which are deployed automatically as virtual machine scale sets, into a new or existing Azure Virtual Network,” Microsoft posted in a blog today. “These data centers can be added to your existing Apache Cassandra ring running on-premises via Azure ExpressRoute in Azure, or another cloud environment.”
Apache Cassandra, which was developed at Facebook based on ideas taken from Google Bigtable and Amazon Dynamo, is an open source wide-column NoSQL database that is primarily used to store and retrieve large data sets with a high degree of reliability. Scalability, fault tolerance, and resilience are the main strengths of this database, which can be deployed as a single logical cluster, or ring, spanning multiple geographical locations.
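For readers unfamiliar with the ring concept, the following Python sketch may help. It is a simplified illustration, not Cassandra's actual implementation: the hash function, the single token per node, the `Ring` class, and the node and datacenter names are all assumptions made for the example. Real Cassandra uses its own partitioner, virtual nodes, and rack-aware placement. The sketch shows only the core idea that nodes from several datacenters (for instance an on-premises site and a cloud region) share one ring, and that each datacenter holds its own set of replicas for a given key.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a ring position (illustrative hash, not Cassandra's partitioner)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class Ring:
    """Toy token ring: one token per node, nodes tagged with a datacenter name."""

    def __init__(self) -> None:
        self._tokens = []   # sorted ring positions
        self._owners = {}   # ring position -> (node name, datacenter name)

    def add_node(self, node: str, datacenter: str) -> None:
        t = token(node)
        bisect.insort(self._tokens, t)
        self._owners[t] = (node, datacenter)

    def replicas(self, key: str, per_dc: dict) -> list:
        """Walk clockwise from the key's position, taking up to per_dc[dc] nodes per datacenter."""
        if not self._tokens:
            return []
        # First node whose token is >= the key's token, wrapping around the ring.
        start = bisect.bisect_left(self._tokens, token(key)) % len(self._tokens)
        chosen = []
        counts = {dc: 0 for dc in per_dc}
        for i in range(len(self._tokens)):
            node, dc = self._owners[self._tokens[(start + i) % len(self._tokens)]]
            if dc in counts and counts[dc] < per_dc[dc]:
                chosen.append((node, dc))
                counts[dc] += 1
        return chosen


# Hypothetical hybrid ring: three on-premises nodes plus three cloud-hosted nodes.
ring = Ring()
for name in ("onprem-1", "onprem-2", "onprem-3"):
    ring.add_node(name, "onprem")
for name in ("azure-1", "azure-2", "azure-3"):
    ring.add_node(name, "azure")

# Ask for two replicas in each datacenter for one example key.
placement = ring.replicas("customer:42", {"onprem": 2, "azure": 2})
print(placement)
```

In this toy model, adding a second datacenter to an existing ring is just a matter of adding more nodes with a different datacenter tag, which is roughly the shape of the hybrid deployment described in Microsoft's announcement.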
While Cassandra’s strength lies in keeping data available, the database management system is notoriously difficult to run and maintain. That has driven some companies to offload the care and feeding of the database to third parties, including cloud providers and DataStax, the commercial vendor behind the open source project.
DataStax says it welcomes the attention that Microsoft has put on Cassandra.
“At DataStax, we are seeing strong and growing interest in Cassandra, as requirements for today’s modern workloads move toward precisely what it was built for: distributed data at planetary scale, with 100% uptime,” Ed Anuff, DataStax chief product officer, said in a statement emailed to Datanami. “Given this demand, we expect to see a variety of services for Cassandra in the market, but most will not provide the freedom of multi-cloud or the elasticity of a serverless architecture.”
Anuff pointed out that last week, DataStax launched what it bills as “the only open, multi-cloud serverless cloud database.” DataStax’s Astra offering runs on Azure, Google Cloud, and AWS, and supports APIs for accessing data via JSON, REST, and GraphQL.
“By decoupling compute from storage, DataStax’s Astra service lets users take advantage of the innate elasticity of the cloud for data, with a cloud agnostic database,” IDC analyst Carl Olofson said in the DataStax announcement. “DataStax has introduced a number of new technologies and services to make Cassandra more interesting for a wider range of workloads, and given its cloud elasticity, serverless Astra represents another important milestone.”
Microsoft made the Cassandra announcement at its Ignite conference, which is taking place virtually this week.