DataStax Seeks to Put NoSQL Clusters on Autopilot
DataStax today unveiled a new release of its Apache Cassandra-based NoSQL database that it says can eliminate much of the manual work associated with managing NoSQL database clusters. The new capabilities in DataStax Enterprise (DSE) version 3.2 effectively allow customers to put their databases on “autopilot,” the company claims.
NoSQL databases are great, supporters say, because they make it relatively easy to implement big fault-tolerant clusters that dynamically scale out on cheap X64 gear without resorting to complex data replication and sharding schemes that are typically required to maintain the fidelity and availability of data in the relational world. Combine that with the capability to support more diverse data types without breaking the schema, and NoSQL databases are like manna from big data heaven.
The reality is obviously a bit more complicated than that. NoSQL and SQL databases are often good at different things, although there’s a fair bit of overlap in the types of applications customers are running on them. But as far as management goes, you can’t just “set it and forget it” with a NoSQL database. You still need to manage the thing. It’s not all milk and honey.
Now that NoSQL databases are becoming fairly common, NoSQL database vendors are addressing some of the pain points of day-to-day management. Vendors like DataStax are still working on the big new features (like support for Cassandra version 2, expected in 2014). But they’re also trying to make their wares responsible and mature members of the data center.
“We’re stepping up automation around the management side of the cluster,” says DataStax co-founder and vice president of customer solutions Matt Pfeil. “From a functionality aspect, we’re saying, how can we help business spend more of their time on their business, and less time focusing on technology.”
To that end, the San Mateo, California-based company today delivered two specific management functions, one for node repair and one for capacity planning. The services are the first of many the company plans under its DataStax Management Services brand.
The new repair service automates the execution of the standard Cassandra “repair” command, according to Pfeil. Repair is a common maintenance task that DBAs must perform on Cassandra clusters about once a week. It performs optimizations and ensures that nodes are kept in sync, which helps to ensure fault tolerance.
Figure: OpsCenter 4.0 makes it easier to manage DSE clusters
“This runs it more efficiently,” Pfeil says. “It optimizes how it’s performed and it requires a lot less manual intervention. They don’t have to interact with it on every node, or schedule it to run it on every machine. They can just interact with one place through OpsCenter, and have the entire cluster take care of itself.”
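To make the scheduling burden concrete, here is a minimal sketch of what a DBA might otherwise script by hand: spreading one repair per node evenly across a weekly window so that no two nodes repair at the same moment. It is not how the DataStax repair service works internally, which the article does not describe. The node names, the seven-day window, and the `stagger_repairs` helper are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import List, Sequence, Tuple


def stagger_repairs(
    nodes: Sequence[str],
    start: datetime,
    window: timedelta = timedelta(days=7),
) -> List[Tuple[str, datetime]]:
    """Spread one repair per node evenly across the window.

    Hypothetical helper: it only computes start times; it does not
    contact any node or run any command.
    """
    if not nodes:
        return []
    step = window / len(nodes)  # gap between consecutive repair starts
    return [(node, start + i * step) for i, node in enumerate(nodes)]


if __name__ == "__main__":
    # Hypothetical node names; a real cluster would supply its own list.
    cluster = [f"node{i}" for i in range(1, 8)]
    for node, when in stagger_repairs(cluster, datetime(2013, 10, 1)):
        # `nodetool repair` is the standard Cassandra repair command.
        print(f"{when:%Y-%m-%d %H:%M}  {node}: nodetool repair")
```

Even this toy version has to be kept in step with the cluster as nodes are added or removed, which is the kind of per-node bookkeeping Pfeil says the service takes over.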
Similarly, the new capacity planning function in DSE 3.2 is designed to keep DBAs better informed about cluster performance and to alert them when a cluster needs to be upgraded to maintain performance levels. The software does this by collecting and storing performance data, and by enabling DBAs to forecast future needs based on analysis of historical trends.
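As an illustration of forecasting from historical trends, the sketch below fits a straight line to evenly spaced utilization samples and estimates how long until a threshold is crossed. The article does not say which method DSE 3.2 uses, so the least-squares fit, the function names, and the sample figures here are assumptions, not the product's algorithm.

```python
from typing import Optional, Sequence, Tuple


def fit_line(samples: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line through (0, s0), (1, s1), ...; returns (slope, intercept)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until(samples: Sequence[float], threshold: float) -> Optional[float]:
    """Sampling periods after the last sample until the fitted trend reaches threshold.

    Returns None when the trend is flat or falling, and 0.0 when the
    fitted line has already reached the threshold.
    """
    slope, intercept = fit_line(samples)
    if slope <= 0:
        return None
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - (len(samples) - 1))


if __name__ == "__main__":
    # Made-up daily CPU utilization percentages for one cluster.
    cpu_percent = [40.0, 42.0, 44.0, 46.0, 48.0]
    print(periods_until(cpu_percent, threshold=80.0))
```

A straight-line fit is only reasonable when the workload grows steadily; seasonal or bursty load would need a different model.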
Maintaining performance levels of DSE clusters is easier as a result of the wide adoption of solid state disks (SSDs), Pfeil says. “The majority of customers–maybe the super majority–are deploying SSDs versus spinning media. On spinning disk, most of the clusters are I/O bound. Now they’re more CPU-bound, which makes forecasting so much easier.”
As the NoSQL adoption needle moves from “hype” to “steady adoption,” NoSQL vendors like DataStax, which claims to have 20 of the Fortune 100 as paying customers, will need to add relatively boring management features to keep their big enterprise customers happy. Some of these DataStax customers have clusters with thousands of DSE nodes, which drives DataStax R&D efforts.
“If you have a lot of servers like that, the last thing you want is a lot of complexity to manage that,” Pfeil says. “And obviously you don’t want to have a giant team to manage that. You want it all to be automated.”
All DataStax Management Services will be managed through OpsCenter. With the new OpsCenter 4.0, DataStax added new screens that provide “visual overviews” of the state of DSE clusters, as well as new bulk operation capabilities that allow DBAs to perform maintenance tasks simultaneously across all nodes in a cluster.
It’s worth mentioning that DataStax also exposes its OpsCenter capabilities through a REST API, which enables all those enterprise customers who have invested in big systems monitoring and management products from IBM, HP, BMC, CA, or Microsoft to hook their new NoSQL stuff into the enterprise mix. It’s not glamorous, but it’s necessary when you’re trying to be a responsible member of the enterprise, as NoSQL most certainly is.