May 12, 2020

Cassandra Now Officially In the Cloud with DataStax Astra


Databases are expected to run in the cloud by default now. We know this because Gartner told us last year. And with today’s launch of Astra, DataStax’s distribution of the Cassandra NoSQL database is now formally available in the public cloud.

Astra has been a long time coming for DataStax. It was first unveiled a year ago as Constellation, but DataStax decided to rebrand its Cassandra cloud service as Astra last fall, when the beta was unveiled. More than 1,000 beta testers later, Astra (which means “star” in Greek) is now generally available as a fully managed service running on Google Cloud and Amazon Web Services (support for Microsoft Azure is in the works).

With Astra, customers will get the extreme scalability and data availability that they have come to expect with Cassandra, but without all of the technical hurdles that a Cassandra deployment typically entails, says Matt Kennedy, DataStax’s senior director of cloud solutions.

“A lot of times people get into trouble by adjusting knobs that they don’t understand, that really have a lot to do with the compute capabilities they’re running on,” he says. “Because we know that with absolute predictability, we can take away a lot of the configuration that is there to potentially trip people up.”

The company has also added a set of guardrails to Astra that prevent users from misconfiguring their Cassandra instances, such as limiting the number of tables in a cluster to 200 and limiting the storage size of an individual column cell to 5MB, he says. These guardrails will also be extended to the open source Apache Cassandra project, Kennedy says.
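To make the idea of a guardrail concrete, here is a minimal sketch in Python of the kind of check described above. It is illustrative only: the function and constant names are hypothetical and are not part of Astra or Cassandra, and it assumes "5MB" means 5 × 1024 × 1024 bytes. Only the two limits themselves come from Kennedy's description.

```python
from typing import List

# Hypothetical names for illustration; not an Astra or Cassandra API.
MAX_TABLES = 200                   # table limit cited in the article
MAX_CELL_BYTES = 5 * 1024 * 1024   # 5MB cell limit; binary megabytes assumed


def check_guardrails(table_count: int, cell_size_bytes: int) -> List[str]:
    """Return a list of guardrail violations (empty if there are none)."""
    violations = []
    if table_count > MAX_TABLES:
        violations.append(
            f"too many tables: {table_count} exceeds limit of {MAX_TABLES}"
        )
    if cell_size_bytes > MAX_CELL_BYTES:
        violations.append(
            f"cell too large: {cell_size_bytes} bytes exceeds limit of {MAX_CELL_BYTES}"
        )
    return violations


if __name__ == "__main__":
    # A cluster at exactly the limits passes; one over either limit does not.
    print(check_guardrails(200, MAX_CELL_BYTES))
    print(check_guardrails(201, MAX_CELL_BYTES + 1))
```

A managed service would presumably enforce such limits on the server side when a request arrives; the sketch only shows the shape of the rule.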

There’s a tendency in open source projects to err on the side of flexibility, which breeds complexity, Kennedy adds. Instead of picking a reasonable default setting for a given parameter, it’s left to the user to configure it to his specific needs.

“After a decade or so of adding features that way, you get to a point where all that ultimate flexibility becomes intractable,” Kennedy says. “So what we’ve done is we’ve taken a long hard look at how all of our customers successfully run in production. We say, we know what the really important tunables are and what the unimportant ones are based on how we advise people to tune their clusters via support tickets.”

Astra takes the muss and fuss away from Cassandra, leaving a lean, mean, and highly scalable database machine. Everybody starts off with a single capacity unit, which is three nodes running in three availability zones (AZs). To expand a cluster, all the user has to do is open the dialog box and add more capacity units.
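As a rough illustration of the capacity-unit arithmetic, the Python sketch below computes cluster size from the number of units. The article states only that one capacity unit is three nodes across three availability zones. That every additional unit adds three more nodes, spread evenly across the same three zones, is an assumption made here, and all names are hypothetical.

```python
# Hypothetical names for illustration; not an Astra API.
NODES_PER_CAPACITY_UNIT = 3   # from the article: one unit is three nodes
AVAILABILITY_ZONES = 3        # from the article: spread across three AZs


def cluster_nodes(capacity_units: int) -> int:
    """Total node count, assuming each unit contributes three nodes."""
    if capacity_units < 1:
        raise ValueError("a cluster needs at least one capacity unit")
    return capacity_units * NODES_PER_CAPACITY_UNIT


def nodes_per_zone(capacity_units: int) -> int:
    """Nodes in each zone, assuming an even spread across the zones."""
    return cluster_nodes(capacity_units) // AVAILABILITY_ZONES


if __name__ == "__main__":
    for units in (1, 2, 4):
        print(units, cluster_nodes(units), nodes_per_zone(units))
```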

Much of the technical work involved in scaling Cassandra is handled under the covers with Astra, and scaling it up and down is simplified thanks to the use of containers and Cassandra’s Kubernetes operator. Eventually, Astra will offer more performance options, including tweaking the amount of CPU and memory available on the servers that host it, but for now there’s just the single option.

“There will always be hand-tuned options,” says DataStax Chief Product Officer Ed Anuff. “We’re not locking you in. We can guarantee you’ll have successful Cassandra for multiple avenues if you’ve got a unique case. But Astra is making it possible for the 80% cases of Cassandra usage.”

In addition to simplifying life for administrators, DataStax is looking to capture the imagination of developers with Astra. By making Cassandra so easy to use, DataStax hopes that developers will avail themselves of Cassandra’s inherent scalability advantages when selecting databases for their new applications.

“Since we’ve been in beta, we’ve literally put thousands of developers through the learning course,” Anuff says. “The process has involved them going and using Astra as their mechanism for leveraging Cassandra, learning CQL, and learning NoSQL data modeling. We think this is going to have a big impact for companies that are trying to close those skills gaps around how they get developers up to speed on Cassandra — not just making it easier to use on any project, but also just widening the set of developers that value it.”

DataStax has ground to make up if it hopes to catch MongoDB, which has long been a favored NoSQL database among developers. MongoDB has also had a compelling cloud story for the past several years, while DataStax struggled to get its cloud service up and running. But MongoDB can’t compete with Cassandra when it comes to large distributed clusters, and this is where DataStax hopes to make its mark with Astra.

The public cloud vendors are also trying to get in on the Cassandra-in-the-cloud game. Last year at re:Invent, AWS unveiled a beta of Managed Cassandra Service, ostensibly the world’s first “serverless” Cassandra service. AWS changed the name to Amazon Keyspaces when it became generally available in April.

However, while Amazon Keyspaces supports the Cassandra Query Language (CQL) and Cassandra APIs, it’s actually built upon DynamoDB, a key-value and document database, not the actual Cassandra database. Microsoft also has something similar with Azure Cosmos DB Cassandra API; it offers a compatible API for developers, but the database underneath isn’t as scalable as Cassandra.

“Those approaches will work for some folks,” Anuff says. “It shows there’s a huge amount of demand, that there’s a whole bunch of people who want Cassandra in the cloud. But the users that we’re trying to serve, they’ve been building applications against Cassandra. They want something that behaves like Cassandra, that’s compatible, where the data modeling behaves the same way. They want a story around portability. Those are the folks we’re trying to serve.”

The majority of Cassandra instances are already running in the cloud, albeit not managed by DataStax. Organizations can treat their new Astra clusters as extensions of their existing Cassandra clusters, which will simplify migration, Anuff says.

Astra is available on Google Cloud through the GCP console. It’s also available on Google Cloud and AWS through
