Why Couchbase Is Adding SQL To Its NoSQL Database
Couchbase’s SQL query language, called N1QL, is now ready for business in the beta of Couchbase Server 4.0, the company announced today. While Couchbase isn’t giving up on NoSQL concepts or architectures, it is counting on the expanded set of capabilities that a SQL query language (along with a distributed query engine) can bring to its document-oriented database.
N1QL (pronounced “nickel”) is a SQL 92-compatible query language and query engine that Couchbase developed specifically for querying JSON documents within its NoSQL database. JSON, you’re probably aware, has flowered into the lingua franca of the modern Web, and document-oriented databases like those from Couchbase and MongoDB are thriving thanks to this extensible new way of sharing and organizing data.
Couchbase’s new technology gives developers a familiar way to build applications that sit atop its database, in addition to the traditional methods, which involve APIs, software development kits (SDKs), and other technically complex approaches. What’s more, N1QL will allow business analysts who can write SQL queries, as well as business intelligence tools that generate SQL queries, to access data stored in Couchbase. While Couchbase is still selling itself as a transactional database, the BI connection gives it a better story to tell in the operational analytics department.
Why is SQL so important for Couchbase? And what does the adoption of SQL technology mean for a company that has built its reputation on the NoSQL movement? Datanami got some solid answers to these questions from Ravi Mayuram, the senior vice president of products and engineering for Couchbase.
For starters, the name “NoSQL” itself is a misnomer, says Mayuram, who is the main architect behind Couchbase’s distributed database. In the same way that a navigational error made in the 15th century by Christopher Columbus still leads us to call Native Americans “Indians” five hundred years later, the popularization of the NoSQL name has dragged along a lot of conceptual detritus that is less than helpful in understanding the most important points about the class of databases commonly called “NoSQL,” he says.
“The problem was never with SQL,” Mayuram says. “It was with relational constraints that relational technologies have, which has to do with schema rigidity. So the dynamism of how the applications are being developed has changed, and the impedance that the older technologies are reinforcing in terms of having to change schema every time you change an application–that was [what NoSQL databases were moving away from], not from a query language.”
The creators of relational database technologies seized upon SQL as the main way to access data stored in rows and columns, not because that was the only method available, but because SQL itself is so powerful, simple, and complete. SQL and relational databases have been used together frequently, but they’re not tied to one another.
“SQL has been around for 40 years, and many other query languages have come and gone. But this one has stayed simply because it has two great characteristics,” Mayuram says. “First, it’s in English and simply expressible. Second, it has a good mathematical foundation underneath, which gives it completeness.”
SQL’s mathematical model empowers developers because they can express practically anything they want, Mayuram says. “SQL has relational algebra behind it, and N1QL is based on that,” he says. “The one in N1QL stands for non-first normal form relational algebra, so it has that mathematical completeness to it. So you know you’re not up the creek when you’re using a system like this–it will allow you to do everything you want when you’re working with a large collection of data.”
N1QL has been in development for about two years, and it’s taken a considerable effort to overcome multiple complexities, Mayuram says. For starters, Couchbase needed to invent a way for SQL to work on a different sort of data structure, JSON documents, as opposed to the rows and columns of data that compose a relational database.
“If you’ve seen a JSON document, it’s a lot of squiggly faces and objects nested within each other. The right way of saying it is it’s a query language that works on nested recursive documents,” he says. “We do something called nesting and un-nesting. It’s basically about taking something that’s nested and hierarchical, and flattening it into rectangular tables and columns... These are like joins. These are great capabilities that we’ve added, which is sort of extending the SQL syntax. There’s a lot of innovation we’ve done.”
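To make the un-nesting idea concrete, here is a minimal Python sketch of flattening one nested JSON document into rectangular rows. It illustrates the general concept Mayuram describes and is not Couchbase’s implementation; the `unnest` helper, the sample order document, and the dotted column names are all assumptions made for this example.

```python
import json


def unnest(doc, array_field):
    """Yield one flat row per element of doc[array_field].

    Hypothetical helper: the parent's scalar fields are repeated on every
    row, much like joining a parent record to its child records.
    """
    parent = {k: v for k, v in doc.items() if k != array_field}
    for child in doc.get(array_field, []):
        row = dict(parent)
        for key, value in child.items():
            # Prefix child keys so they cannot collide with parent keys.
            row[f"{array_field}.{key}"] = value
        yield row


# Made-up sample document: one order with a nested array of line items.
order = json.loads("""
{
  "order_id": 1001,
  "customer": "Ada",
  "items": [
    {"sku": "A1", "qty": 2},
    {"sku": "B7", "qty": 1}
  ]
}
""")

rows = list(unnest(order, "items"))
for row in rows:
    print(row)
```

Each nested item becomes its own row carrying the parent’s `order_id` and `customer`, which is why the operation resembles a join between a document and its own embedded array.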
Secondly, Couchbase needed its N1QL query engine to run in a distributed manner, across many different nodes, as opposed to the scale-up SMP servers that house big relational databases. “Knowing that the data is going to be distributed across a horizontal set of machines and how to parallelize the queries and when to serialize and when to parallelize and how to bring the data together–there’s a lot of innovation there too,” Mayuram says.
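The pattern of parallelizing work across machines and then bringing the data together is often called scatter-gather. The Python sketch below shows the shape of it under loud assumptions: three in-memory lists stand in for data held on three nodes, threads stand in for network calls, and `partial_sum` and `merge` are hypothetical names. It says nothing about how the N1QL engine actually plans or executes queries.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions standing in for documents stored on three nodes.
partitions = [
    [{"city": "Oslo", "total": 40}, {"city": "Lima", "total": 15}],
    [{"city": "Oslo", "total": 10}],
    [{"city": "Lima", "total": 5}, {"city": "Pune", "total": 30}],
]


def partial_sum(docs):
    """Scatter step: each 'node' aggregates only its own documents."""
    sums = {}
    for doc in docs:
        sums[doc["city"]] = sums.get(doc["city"], 0) + doc["total"]
    return sums


def merge(partials):
    """Gather step: combine the per-node results serially."""
    merged = {}
    for partial in partials:
        for city, total in partial.items():
            merged[city] = merged.get(city, 0) + total
    return merged


# Run the per-partition work in parallel, then merge in one place.
with ThreadPoolExecutor() as pool:
    partials = list(pool.map(partial_sum, partitions))

result = merge(partials)
print(result)
```

The per-partition aggregation can run in parallel because the partitions are independent, while the final merge is serialized because it needs every partial result.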
Couchbase CEO Bob Wiederhold is looking forward to having his customers get a hold of N1QL, which should make it much easier to access and manipulate data. While he isn’t ready to relinquish the NoSQL name just yet, he does expect his competitors to add their own SQL capabilities in the future.
“We’re surprised we’ll be the first and only NoSQL database that has a SQL-based query language,” Wiederhold tells Datanami. “In general, we’d encourage other NoSQL vendors to move to a SQL-based query language. Obviously that’s already happened in the Hadoop space. Everyone has their SQL-based query language. We have ours, and hope other NoSQL vendors have theirs.”