October 31, 2013

Clustrix Offers Free Downloads of NewSQL Database

Alex Woodie

NewSQL vendor Clustrix today posted its scale-out relational database on the Web for anybody to download. The free Clustrix Community Edition won’t offer the scalability and high performance of the company’s paid versions, but it could help boost sales of the commercial editions and help Clustrix gain traction among developers.

Clustrix Community Edition supports all of the features that are available in the standard edition of the software. This means it’s an ACID compliant, drop-in replacement for MySQL databases built on a shared-nothing architecture that supports automatic sharding and replication of data to multiple nodes in a cluster. The free edition also supports what Clustrix CEO Robin Purohit describes as its “secret sauce,” namely a parallel query engine that can execute complex queries against that distributed data at high speeds.

But Clustrix restricts the community edition to running on 12 cores or fewer. Clustrix Community Edition can run on a cluster of up to three computers equipped with four processor cores each. That’s enough capacity to allow interested parties to try out the software, but not really enough to run a production application on.

The company also offers free trials of its standard edition, which has no limitations on cluster size or cores, but which expires after 45 days. Pricing for the standard edition (which is limited to 48 cores) starts at $1,200 per node per. Enterprise licenses are available for bigger systems.

The free software approach has the potential to be a savvy business move for Clustrix, which is one of a growing number of NewSQL and NoSQL database vendors looking to upset the status quo and capture a share of the growing market for more flexible relational database management systems. At seven years of age, the San Francisco-based company is actually one of the older and more established vendors in this space, which shows you how quickly the market is moving. It also has some established customers, such as AOL, Rakuten Global Markets (formerly Buy.com), and Symantec, although most of its customers are smaller outfits.

Today’s Clustrix announcement also marks the movement away from an appliance-based sales strategy to a software-dominated one. “We’ve always been a software company, but we released our initial product in an appliance because high performance computing wasn’t widely available yet,” Purohit tells Datanami in an interview. “High performance blades, flash storage, low latency interconnects–these things are becoming commodity now, but three years ago were not.”

Purohit and friends are champing at the bit to go up against Larry Ellison’s big red database machine located down the peninsula in Redwood City, and, to a lesser extent, the database businesses of IBM and Microsoft. While Clustrix’s relational database hasn’t proved itself in really big deployments yet, it offers more flexibility and lower cost.

Those “legacy” relational databases, he says, do a lousy job of scaling horizontally. Instead, most of them are designed to scale vertically. NewSQL vendors like Clustrix are banking on scale-out, clustered architectures that can expand capacity by readily absorbing new nodes while providing near-linear performance. This technique also provides business resiliency by absorbing a sudden loss of a node without compromising the application. It’s more difficult to expand capacity in the vertical approach, which relies on big symmetric multi-processing (SMP) servers that use clustering only for business resilience.

Clustrix is also positioning its RDBMS to handle transactional and analytical workloads at the same time. Many businesses today–particularly the Web-based businesses that Clustrix is targeting–don’t want to run separate systems for transactional and analytical workloads, Purohit says. Thanks to the way it parallelizes and tunes queries, ClustrixDB can handle both types of workloads while delivering acceptable performance for each.

Purohit says his customers don’t have the time to offload data into Hadoop and run queries with its batch-oriented MapReduce analytic engine. “I think there’s always a role for a secondary database for large-scale, over-time analytics,” Purohit says. “But if you can do most of it at the same time…we’re absolutely seeing the demand.”

Hadoop clearly offers capabilities that Clustrix can’t, including scalability into the hundreds of terabytes or petabyte range. (The biggest Clustrix database is 23 TB on a cluster of 168 nodes.) Clustrix also doesn’t support as many semi-structured and unstructured data types as Hadoop. While some are trying to remake Hadoop into a transactional data store layer, it’s still early going.

Clustrix is restricted to supporting traditional relational data types, according to a Gartner Magic Quadrant report on the operational database management system segment. The database’s performance, however, was said by Gartner to “rival the NoSQL DBMSs,” and the firm also highlighted its capacity for self-management. Gartner placed Clustrix in the “niche players” quadrant, near NuoDB.

Clustrix is positioning itself to contend in the market for new data-driven websites, including e-commerce sites, dating websites, and social media sites. It has a strong focus on SQL, and its wire compatibility with MySQL will attract disaffected Oracle customers. Its strict focus on standard relational data types may give more flexible NoSQL databases an edge, but its work to widen beyond transaction processing into more SQL-driven analytical workloads is worth watching.

You won’t find the broad data-type support and scalability of Hadoop, or the super-high-speed analytics that columnar-oriented databases like EMC Greenplum or HP Vertica specialize in. But for users who are looking for a solid operational database that offers analytical capabilities and can support data stores up into the tens of terabytes, Clustrix looks like it might be a solid bet.
