RDBMS: The Hot New Technology of 2014?
Selecting a database is a bit like picking a religion. There are pros and cons to each, and the discussions run hot and heavy. With the rise of the big data phenomenon, adherents of the NoSQL and NewSQL database camps have dominated the discussion. But the relational camp has worked hard to catch up in terms of technology, and now threatens to swing the pendulum the other way.
|some nosql database vendors|
The rise of NoSQL databases over the last several years closely mirrors the rise in social media and mobile technologies. As people started generating more information on the Web, developers needed suitable backend repositories to store this huge influx of data.
The standard SQL-based relational database, having evolved over the last several decades to handle big transaction processing workloads that demanded a very high level of reliability and consistency, was mostly ill-suited for this task. For starters, the RDBMS didn’t support the right file formats. The rigid schemas inherent in most RDBMSs also exacted a high toll whenever new data types were added. And the scalability of the RDBMS was closely tied to scaling up big SMP boxes, an approach that is anathema to the horizontally distributed architectures preferred by big website operators.
This functionality gap proved a fertile spot for new types of database management systems to take root and evolve. It gave rise to a new breed of NoSQL databases, including key-value stores like DynamoDB, document stores like MongoDB, column-oriented stores like HBase, and graph databases like AllegroGraph. At the same time, the rise of Hadoop as a big data platform where some of these databases can run fueled the perception that E.F. Codd’s relational database was a thing of the past, and not a suitable place to run the big data workloads of the bright new world.
Not so fast, say proponents of relational databases. Ditching the SQL part of relational databases is like throwing out the baby with the bathwater, they say. As the lingua franca of business, SQL is simply too embedded and vital–and just too incredibly useful–to completely ditch in favor of new and relatively unproven access methods. Ditto for the relational part of the RDBMS.
This line of reasoning was bolstered when Facebook decided earlier this year that SQL and RDBMSs were, in fact, critical and necessary parts of its social media empire. “There really was an attitude internally, fed by what is going on in the industry as well, that relational has had its day in the sun and everything now needs to move to Hadoop,” Facebook analytics boss Ken Rudin told Enterprise Tech, adding that he considers himself a “born again SQL fan.”
Now, RDBMS vendors are striking back, hoping to regain some of the ground they lost–if not in terms of actual market share (which never really took a dip) then in terms of mind share and people’s perceptions.
Go Big or Go Fast
Ryan Betts, the CTO of NewSQL database vendor VoltDB, says organizations are finding that NoSQL technology is well-suited for some types of big data applications, but ill-suited for others. “Some of the places we see people trying to use NoSQL and really struggling are these real-time decision, real-time analytics problems,” he tells Datanami.
“They’ve tried to apply the NoSQL technology to that, and that’s a mistake. NoSQL is fundamentally not about making real-time decision making. It’s all about storing vast amounts of data,” Betts says. “The sacrifices you make in a NoSQL system around transactions and ease of use and consistency–all of those sacrifices are made to allow you to store vast amounts of information. When vast amounts of information is your goal, they’re great sacrifices to make. But when your goal is making transactional decision in real-time against incoming streams of data, they’re absolutely the wrong sacrifices to make.”
To be sure, organizations should use the right technology for every application. There’s no such thing as a magic hammer that can handle all nails and screws. NewSQL vendors like NuoDB and FoundationDB and NoSQL database vendor MarkLogic have acknowledged the problems around strong consistency and support for the ACID precepts that have driven RDBMS development for decades. Betts’ point is that, as organizations move to adopt Web applications that are more complex and more interactive, the need to be able to react and transact on fast-moving data in a push model becomes critical.
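The transactional guarantee at stake here can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for a production RDBMS, and the accounts table and the helper names (make_bank, transfer, balances) are hypothetical. The point is atomicity: when the second statement of a transfer violates a constraint, the first one is rolled back as well.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two illustrative accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 20)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on success."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # An overdraft violates the CHECK constraint and raises here,
            # which also undoes the credit applied just above.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
```

A valid transfer changes both rows together. An overdraft attempt returns False and leaves both balances exactly as they were, which is the kind of consistency that the NoSQL trade-offs described above give up.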
|some newsql database vendors|
The advantages that NoSQL databases held in terms of support for unstructured data types are melting away, Betts says. “Relational database have become very good at storing JSON documents or other semi-structured data. The idea that you don’t know the structure of your data, and therefore you should give up relational systems, it turns out to be a false premise,” he says.
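As a rough sketch of what storing semi-structured documents in a relational table can look like, the example below keeps a free-form JSON payload next to ordinary typed columns and filters on it with SQL. It assumes Python's built-in sqlite3 module rather than any vendor's product. The events table and the json_get function are hypothetical names, and json_get is registered from Python here, whereas a production database would supply its own JSON functions.

```python
import json
import sqlite3


def json_get(doc, key):
    """Hypothetical SQL helper: return one top-level field of a JSON document."""
    value = json.loads(doc).get(key)
    if value is None or isinstance(value, (str, int, float)):
        return value
    return json.dumps(value)  # nested values are handed back as JSON text


def open_store():
    """Create an in-memory table mixing fixed columns with a JSON payload."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("json_get", 2, json_get)
    conn.execute(
        "CREATE TABLE events ("
        " id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    return conn


def add_event(conn, user_id, payload):
    """Store a dictionary of any shape as a JSON document."""
    conn.execute(
        "INSERT INTO events (user_id, payload) VALUES (?, ?)",
        (user_id, json.dumps(payload)),
    )


def events_of_type(conn, event_type):
    """Filter on a field inside the document using plain SQL."""
    rows = conn.execute(
        "SELECT user_id, json_get(payload, 'device') FROM events"
        " WHERE json_get(payload, 'type') = ? ORDER BY id",
        (event_type,),
    )
    return rows.fetchall()
```

Documents with different fields share one table without any schema change, and a field that is missing from a document simply comes back as NULL.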
Similarly, a horizontally distributed architecture is not inherent to either SQL or NoSQL databases. “That’s more about the design of a distributed system, the design of the replication mechanism, and all those things can be done in conjunction with the relational system,” he says. “The relational systems have been very good at learning the lessons that these NoSQL systems presented... The two advantages they had several years ago... have been taken away.”
Don’t Believe the Hype
Ed Boyajian, the CEO at EnterpriseDB, the commercial entity behind the open source PostgreSQL database, says there has been too much attention paid to NoSQL database vendors, who he says mostly operate on the “fringe.”
“I think a lot of IT execs get kind of aroused by the interest in these new technologies, whether or not they meet the majority of needs of the data center,” Boyajian tells Datanami. “When we look at the landscape of apps that an enterprise runs, we still see the preponderance of need around transactional relational systems. While new apps and workloads for big data are emerging, we still think those are the edge use cases in the data center today.”
|some traditional sql database vendors|
PostgreSQL has been retrofitted to handle some of the common big data use cases. It supports JSON for semi-structured Web-based documents, and supports fast data lookups through hstore, its key-value data add-on. In 2014, EnterpriseDB will be looking to bolster the database’s high availability with support for multi-master database writes, which will further eat into the advantage that NoSQL databases once held in the area of distributed architecture (although PostgreSQL is still architected mostly on a vertical scale-up model).
The vast repository of SQL skills, not to mention the existing business intelligence tools built on SQL, further validates the wisdom of sticking with SQL and relational technologies instead of ditching them for unproven NoSQL technologies, Boyajian says.
“The question now is at what scale do we provide insight, what’s the next level?” he asks. “Is it really more about the data analytic than the data storage technology. I’d argue that the really interesting guys in this conversation are the data scientists, the guys who actually write the algorithms who gather insight from all that data.”
While new apps and workloads for big data are emerging, those are edge cases and haven’t gone mainstream, he says. “The overemphasis on the fringe use cases distracts from the power of relational technology and skills that people have in the ecosystem and how far they’ve evolved,” Boyajian says.
Undoubtedly, the main beneficiaries of the $30-billion RDBMS business–Oracle, IBM, and Microsoft–would agree that there’s too much hype around NoSQL, and would advise customers not to be hasty in deciding to end maintenance contracts. (Think of Larry Ellison’s slip fees, after all.) The NoSQL-NewSQL database business is on pace to reach 10 percent of the size of the RDBMS business in a few years. It’s not there yet, and there will likely be a lot of consolidation around the various database types, including key-value stores, document stores, column-oriented stores, graph databases, and the various hybrids. In the meantime, relational systems will continue to power most of the real world through 2014.