The traditional database world is under more fire this week as the Cassandra Summit kicks off in San Francisco.
Oracle, the poster child for the traditional database, was the target of choice when Apache Cassandra database vendor DataStax released news dissing the entrenched database giant. According to DataStax, three companies are stepping forward to publicly discuss their respective decisions to augment, and even (partly) scrap, their traditional Oracle databases in favor of the Apache Cassandra NoSQL database platform.
Among the three stepping forward (Netflix, Ooyala, and Openwave Messaging), the most notable and well known is Netflix, the online movie service which, according to reports, has a customer base of 27.1 million U.S. subscribers streaming video to connected devices such as televisions, tablets, and mobile phones.
We spoke with Billy Bosworth, CEO of DataStax, who shared that Netflix is storing approximately 95% of its operational data in Cassandra in order to deliver its service to the millions of Netflix subscribers.
“The moment you log in – the login information, that’s in Cassandra,” boasted Bosworth, explaining that Netflix now stores 95% of its operational data in Cassandra’s columns. “The immediate screen that you’re presented with – all those different boxes and categories – all that is being built from Cassandra calls.” He hit the checklist – the titles, the subscriber playlist, their history, their particular affinities, and even the tracking of where they are in a movie so that it can be picked back up (on any device) where it was left off – all of it is being stored by Cassandra. The only information that is not being stored in Cassandra, said Bosworth, is the financial, back-office stuff, and the actual movies themselves.
While it’s no surprise that DataStax is boasting of customer success during a conference week dedicated to its own technology, what might surprise you is that this company – which barely existed three years ago (DataStax launched as “Riptano” in April 2010 with seed money from Rackspace) – says it has now penetrated 20 of the Fortune 100 companies with its database offering. It’s an impressive feat when you consider that what is being talked about here is a database for large scale, mission-critical applications (such as what Netflix is offering).
It becomes even more curious when you consider a shocking fact: DataStax didn’t even have security built into their product until January of this year when they launched the DataStax Enterprise 3.0 version of their product.
“It sounds kind of embarrassing to say that, but we didn’t have really any good security at the database level,” Bosworth admits, saying that after they hit a wall on sales due to the security issue, they regrouped, re-prioritized, and finally launched a product complete with a suite of security features, including internal and external authentication, object permission management, data encryption, data auditing, and more.
“That opened up a whole new door of customers to us,” says Bosworth, noting that government, finance, and other vertical arenas where security is a chief concern have started knocking on their doors. Bosworth says that they are up from the 27 customers that they had at the end of 2011 to over 300 customers today.
If anyone is surprised by these numbers, they’d be in good company. Bosworth admits that he’s a little taken aback by them himself in reflecting on the penetration they’ve managed in the Fortune 100. “It’s faster than I thought, to be honest with you,” he says. “I really thought it would take another year before we started penetrating that level of account.”
With companies like Netflix coming out and openly calling Cassandra their “database of choice,” Bosworth credits their tight focus as a secret ingredient to their success to date. “We’ve stayed laser focused for a very long time on servicing an application,” he says. “Our mission in life is to be an online data store for an application team, to give them everything they need for their hot data in context – and that’s a very specific use case.”
Cassandra is architected with continuous availability as part of its DNA, which makes it an attractive option for servicing enterprise applications that need to be online and available all the time. Constructed using a masterless architecture approach, the nodes in a Cassandra cluster are all fully distributed peers with no single node being a centralized master node – no single point of failure. While this is a strong selling point for some applications, it doesn’t come without having to pay the CAP Theorem piper.
In CAP Theorem terms, a database like HBase is a CP type system, prioritizing consistency and partition tolerance at the expense of availability. Cassandra, on the other hand, is an AP type system, which values availability and partition tolerance, sacrificing consistency. It’s a trade-off – while Cassandra’s nodes are available virtually all the time, they may not all be consistent with the most current information. This might not be a big deal when the data in question is where you left off while watching Breaking Bad on Netflix, but for financial applications where money is being added to and subtracted from accounts in real time, that consistency across nodes can be critical.
Cassandra compensates and gives assurances with what they call “tunable consistency,” but even Bosworth himself would tell you that CAP Theorem workarounds aren’t going to be as pure as being loyal to the original use case for each respective database.
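To make the “tunable consistency” idea concrete, here is a minimal sketch of the arithmetic behind it, not Cassandra’s actual API. It assumes a replication factor of N copies per row, a write that waits for W replicas to acknowledge, and a read that consults R replicas; the function names are hypothetical and chosen for illustration only. The well-known rule of thumb is that a read is guaranteed to see the latest acknowledged write when R + W > N, because the two sets of replicas must then share at least one node. The brute-force check below simply confirms that rule by enumerating every possible pair of replica sets.

```python
from itertools import combinations


def quorum(n: int) -> int:
    """Smallest majority of n replicas (hypothetical helper, not a Cassandra call)."""
    return n // 2 + 1


def read_sees_latest_write(n: int, w: int, r: int) -> bool:
    """Rule of thumb: read and write replica sets must overlap when r + w > n."""
    return r + w > n


def stale_read_possible(n: int, w: int, r: int) -> bool:
    """Brute force: can some set of w written replicas and some set of
    r read replicas have no replica in common?"""
    replicas = range(n)
    for written in combinations(replicas, w):
        for read in combinations(replicas, r):
            if not set(written) & set(read):
                return True  # the read could miss the write entirely
    return False


if __name__ == "__main__":
    n = 3  # assumed replication factor for this illustration
    for label, w, r in [("ONE/ONE", 1, 1),
                        ("QUORUM/QUORUM", quorum(n), quorum(n)),
                        ("ALL/ONE", n, 1)]:
        print(label, "stale read possible:", stale_read_possible(n, w, r))
```

With three replicas, writing and reading at a single node each leaves room for a stale read (fine for a playback position), while quorum writes paired with quorum reads do not, at the cost of waiting on more nodes per request. That latency and availability cost is the trade-off the CAP discussion above describes.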
And there are plenty to choose from, as we’ve witnessed in the Hadoop space, where many of the distro vendors have started to roll their own database solutions to compete with NoSQL offerings like Cassandra. Virtually all of the major Hadoop distro vendors have integrated some form of HBase into their platforms, and competing databases like Couchbase and Mongo have their own claims to fame.
Bosworth says that while there is a lot of competition, they’re not standing still. The company has taken steps to get more aggressive in European markets by opening up a full subsidiary in London, and expanding their Cassandra Summit to Europe in 2013. The company currently employs over 100 people, and Bosworth says that they are in the process of doubling that to over 200 in the next 3 to 4 quarters.
Ultimately, Bosworth is confident that DataStax and Cassandra have solid footing in the burgeoning enterprise big data application landscape. “We don’t try and do 50 different things,” he muses. “We try and do our thing extremely well because that’s an environment that is not going to go away. That enterprise online application database is going to be a staple in all enterprises – just like it’s been for the last 20 years, it’s probably going to be for the next 20.”