April 26, 2016

Neo4j Pushes Graph DB Limits Past a Quadrillion Nodes

A graph database with a quadrillion nodes? Such a monstrous entity is beyond the scope of what technologists are trying to do now. But with the latest release of the Neo4j database from Neo Technology, such a graph is theoretically possible.

There is effectively no limit to the sizes of graphs that people can run with Neo4j 3.0, which was announced today, says Neo Vice President of Products Philip Rathle.

“Before Neo4j 3.0, graph sizes were limited to tens of billions of records,” Rathle says. “Even though they may not have tens of billions of data items to actually store in a graph, just having a ceiling made them nervous.”

By adopting dynamically sized pointers, Neo4j can now scale up to run the biggest graph workloads that customers can throw at it. The company expects some of its customers will begin to put that extra capacity to use, for things such as crunching IoT data, identifying fraud, and generating product recommendations.
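To make the idea of dynamically sized pointers concrete, here is a minimal Python sketch. It is not Neo4j's actual storage format, which the article does not describe. The function names and the one-byte length prefix are assumptions made for illustration. The point it shows is that a record reference can use only as many bytes as its ID requires, so small graphs stay compact while the address space can still reach past a quadrillion records.

```python
def pointer_width(record_id: int) -> int:
    """Smallest number of bytes that can hold record_id (at least 1)."""
    if record_id < 0:
        raise ValueError("record IDs must be non-negative")
    return max(1, (record_id.bit_length() + 7) // 8)


def encode_pointer(record_id: int) -> bytes:
    """Hypothetical layout: one length byte, then the ID in big-endian order."""
    width = pointer_width(record_id)
    return bytes([width]) + record_id.to_bytes(width, "big")


def decode_pointer(data: bytes) -> int:
    """Read the length byte, then decode that many bytes as the ID."""
    width = data[0]
    return int.from_bytes(data[1:1 + width], "big")


# How many bits does it take to address a quadrillion (10**15) records?
QUADRILLION = 10**15
bits_needed = (QUADRILLION - 1).bit_length()

if __name__ == "__main__":
    for rid in (5, 40_000_000_000, QUADRILLION):
        encoded = encode_pointer(rid)
        print(rid, "->", len(encoded), "bytes")
    print("bits needed for a quadrillion IDs:", bits_needed)
```

In this sketch a small ID costs two bytes while a quadrillion-scale ID costs eight, which is the trade-off a variable-width scheme is meant to offer over a single fixed pointer size.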



“We actually have some customers who genuinely needed to get into the hundreds of billions [of nodes, relationships, and properties] and even beyond,” Rathle says. “There’s effectively no upper limit. The upper limit is in the quadrillions, so more than hundreds of trillions, and I don’t think there are any graphs in the world that are that large. Facebook’s largest graph is in the single-digit trillions.”

Half of the top 10 retailers in the world are using Neo4j for generating product recommendations, Rathle says. And while these customers aren’t hitting the limits of the database yet, the newfound scalability in version 3 will give them the confidence to stick with Neo4j as their database expands.

When it comes to product recommendations, retailers can get very good results using a smaller amount of data in a graph, compared to using a lot of data with a non-graph approach, Rathle says. “Then of course the more data you have, the finer your recommendations and the higher the quality is,” he adds.

The extra headroom in Neo4j 3.0 will give retailers the capability to move away from basing recommendations just on items that people put into the cart and what they search for, and instead include “every single click they’ve ever done,” Rathle says. “It will be interesting to see how companies use these new higher limits.”

Neo4j 3.0 brings several other notable enhancements, including a new binary wire protocol called Bolt that should make it easier to develop applications on the graph database. Bolt, in combination with a series of new drivers for JavaScript, Java, .NET, and Python, will provide a typesafe environment for developers that eliminates uncertainty that previously existed when passing JSON data over REST.

“Having official language drivers is actually a big deal for us,” Rathle says. “One big impact of Bolt plus the new official language drivers is they work in combination with Cypher [Neo4j’s query language], which works with the storage engine, to carry through your data end to end in a typesafe way across the whole stack, and that allows for more convenient development.”
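The kind of uncertainty a typed binary protocol removes can be sketched in a few lines of Python. This is an illustration of the general problem, not of Neo4j's REST API or of Bolt's real wire format. The single-byte `I` and `F` type markers and the helper names are invented for the example. It shows a 64-bit integer ID losing precision when a JSON client reads every number as a double (as JavaScript does), and surviving intact when each value travels with an explicit type tag.

```python
import json
import struct

# An integer ID that an IEEE 754 double cannot represent exactly.
big_id = 2**53 + 1

# Simulate a JSON client that parses every number as a double.
as_double = json.loads(json.dumps({"id": big_id}), parse_int=float)["id"]


def pack_value(value) -> bytes:
    """Encode an int or float with a made-up one-byte type marker."""
    if isinstance(value, bool):
        raise TypeError("bool is not handled in this sketch")
    if isinstance(value, int):
        return b"I" + struct.pack(">q", value)  # signed 64-bit integer
    if isinstance(value, float):
        return b"F" + struct.pack(">d", value)  # 64-bit float
    raise TypeError(f"unsupported type: {type(value).__name__}")


def unpack_value(data: bytes):
    """Decode a value using its marker, so the original type is preserved."""
    tag, payload = data[:1], data[1:]
    if tag == b"I":
        return struct.unpack(">q", payload)[0]
    if tag == b"F":
        return struct.unpack(">d", payload)[0]
    raise ValueError(f"unknown type marker: {tag!r}")


if __name__ == "__main__":
    print("via JSON as double:", int(as_double), "matches:", int(as_double) == big_id)
    restored = unpack_value(pack_value(big_id))
    print("via tagged binary: ", restored, "matches:", restored == big_id)
```

Because the receiver reads the type from the marker rather than guessing it from the text, an integer stays an integer and a float stays a float on both ends of the connection.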

The new version also adds support for Java stored procedures, which will make it easier to build complex workflows using pre-built components in a reusable fashion. The company says Java stored procedures function as a “Swiss army knife” that allows developers to quickly accomplish certain tasks, such as accessing and loading data from a separate database via JDBC or inferring a schema from data.

Neo4j 3.0 also brings a companion cloud service called Neo4j Browser Sync that’s designed to make developers’ jobs easier by storing and synchronizing commonly accessed scripts, settings, and graph style sheets. The companion cloud service is automatically set to work with the Neo4j development tool, called the Neo4j Browser, and can pull existing online credentials from a GitHub ID.
