What SQL’s Co-Creator Sees in NoSQL
As the co-creator of Structured Query Language (SQL), Don Chamberlin knows a thing or two about pulling data out of relational databases. So when he spoke at the user conference for NoSQL database vendor Couchbase last week, it raised a few eyebrows.
Chamberlin is a computer scientist who developed SQL with Raymond Boyce back in the 1970s while working at IBM’s Almaden Research Center. As a member of the System R research team, Chamberlin also had a hand in developing much of IBM’s relational database technology. He was named an IBM Fellow in 2003 before retiring a few years ago.
During a one-on-one discussion with Couchbase CTO Ravi Mayuram at the vendor’s conference in San Jose last week, Chamberlin joked that he was happy with the name NoSQL. “I think that when a language is so well recognized that other languages are defining themselves as not that one, you must be doing pretty good,” he quipped.
“But seriously, I don’t think any single language is the answer to all applications,” he continued. “I think SQL does a good job for the kinds of data it was designed for but other kinds of data have other requirements.”
A lot has changed in the data management world since Chamberlin developed SQL, and he recognized that newer data types are filling a need for more flexibility when it comes to storing and retrieving data from new applications.
“As you know, I spent most of my career in the relational database world,” Chamberlin said. “That’s a very predictable and simple world. Every table has a schema. Every row looks the same. All the values in the column have the same type, so it’s nice, and compilers are able to take advantage of all this type information for building a query platform.”
That level of predictability still exists for some applications, but it’s going by the wayside for many newer programs, including today’s popular Web and mobile apps. “Nowadays, as you know, there are a lot of applications that deal with data that doesn’t fit nicely into rows and columns, or … you might have to distribute it across multiple tables, and that creates some problems of its own,” he said.
Chamberlin encountered this dynamic near the turn of the century, when he helped develop XQuery, a language designed to query XML. At first, he didn’t think SQL would play a role in accessing those data types at all. “We hear more about semi-structured data with formats like XML and JSON, and it’s reasonable to ask what kind of languages are adaptable to that new type of data,” he said. “And frankly it’s not obvious at first that SQL is the answer to that.”
But then he started hearing about new dialects of SQL that were being designed specifically for querying JSON data. “I heard about N1QL at Couchbase and there’s SQL++ at UC San Diego and the ASTERIX project at UC Irvine, and I scratched my head and said, what’s going on here?” he says. “JSON doesn’t look like tables. How can this be?”
You can sort of imagine that a JSON object is kind of like a row in a relational database, he says. And JSON has arrays, just like relational databases can have arrays. And an array of objects in JSON looks “sort of like a table” in a relational database, he said.
It’s not a one-to-one match, however. “There’s still some surprises in there, of course,” he said. “Not all the rows look the same as a table. And you can put tables inside other tables. That’s not relational.”
But there’s enough similarity between JSON and relational data that we can use some of the lessons we learned from the old SQL world and apply them to the new NoSQL world, he says.
“The trick here is to extend the semantics to encompass these non-relational features while retaining as much of the familiar SQL syntax as you can,” Chamberlin says. “In fact, a N1QL query operating on a JSON array of objects looks and behaves pretty much like a SQL query on a table. But it’s not really easy to make that happen, because you don’t have the schema information that you would normally rely on.”
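To make the analogy concrete, here is a minimal Python sketch of treating a JSON array of objects as a table. It is not N1QL and not how Couchbase implements anything; the sample documents and the `select` and `unnest` helpers are hypothetical, made up purely for illustration. It shows the two "surprises" Chamberlin mentions: rows that do not all share the same fields, and tables nested inside other tables.

```python
import json

# Hypothetical sample data: an array of JSON objects standing in for a "table".
# Note the rows do not share a schema, and "orders" is a table inside a row.
docs = json.loads("""
[
  {"name": "Ada", "city": "London", "orders": [{"sku": "A1", "qty": 2}]},
  {"name": "Bo", "orders": []},
  {"name": "Cy", "city": "London", "age": 41}
]
""")

def select(rows, fields, where=lambda r: True):
    """Rough analogue of SELECT fields FROM rows WHERE where(row).

    With no schema to rely on, a field may simply be absent from a row;
    this sketch reports an absent field as None.
    """
    return [{f: r.get(f) for f in fields} for r in rows if where(r)]

def unnest(rows, array_field, keep):
    """Flatten a nested array into one output row per nested item,
    carrying along the parent fields named in `keep`."""
    out = []
    for r in rows:
        for item in r.get(array_field, []):
            out.append({**{k: r.get(k) for k in keep}, **item})
    return out

# Looks much like: SELECT name, city FROM docs WHERE city = 'London'
london = select(docs, ["name", "city"], where=lambda r: r.get("city") == "London")

# Every row, including one with no "city" field at all.
everyone = select(docs, ["name", "city"])

# The nested "orders" table flattened alongside its parent row.
order_lines = unnest(docs, "orders", keep=["name"])
```

The filter reads almost like SQL, but each helper has to decide what to do when a field is missing, which is the kind of work a relational engine gets for free from the schema.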
In the end, Chamberlin finds the NoSQL world has more similarities to the SQL world than might be obvious at first glance. “I think what people are trying to do is to preserve the investment in SQL while relaxing some of the constraints of the relational database model,” he says.