Why Graph Databases Are Becoming Part of Everyday Life
Zephyr Health, a San Francisco-based software company that offers a data analytics platform for pharmaceutical, biotech and medical device companies, wanted its customers to unlock more value from their data relationships. Doing so would enable pharmaceutical companies, for example, to find the right doctors for a clinical trial by understanding relationships among a complex mix of public and private data such as specialty, geography, and clinical trial history.
Old-school SQL databases were not up to the task. Traditional SQL databases don’t handle data relationships well, and most NoSQL databases don’t handle data relationships at all. Nor are they well equipped to handle data that’s always changing, such as streams of new information coming in from doctors’ surveys.
Zephyr found the solution in a graph database, chosen for its capability and scale. Graph databases are key to discovering, capturing, and making sense of complex interdependencies and relationships, both for running an IT organization more effectively and for building next-generation functionality for businesses. They are designed to easily model and navigate networks of data, with extremely high performance. To fully appreciate the value of the graph, consider that early adopters of graph databases, such as Facebook and LinkedIn, became household names and unrivaled leaders in their sectors.
While SQL databases have been a mainstay in enterprise IT departments for decades, they have increasingly given way to NoSQL solutions as data volumes and connections boom. It’s important to keep a discerning eye on so-called NoSQL databases; the term can be annoyingly vague and is applied to wildly differing database types. Several NoSQL database categories have emerged, each tackling a distinct business problem: document, column-family, key-value, and graph. While the entire NoSQL sector is attracting increasing attention, graph databases are generating real and lasting excitement, with interest in the sector having grown 500% in the last two years alone. Forrester Research has reported that graph databases, the fastest-growing category in database management systems, will reach more than 25 percent of enterprises by 2017.
Graph databases are useful across industries, from telecommunications to financial services, logistics, hospitality, and healthcare. Despite their market momentum, however, some people still consider graphs to be mysterious. In actuality, graph databases rest on natural and intuitive principles that resemble the tasks we perform every day far more closely than relational database management systems do; the latter have a fairly steep learning curve by comparison. If you’ve ever worked out a route via a mass transit map or followed a family tree, you have manually run your own graph-based query.
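To make the transit-map comparison concrete, here is a minimal sketch in Python of the kind of query a rider runs in their head. The station names and the `TRANSIT_MAP` layout are invented for illustration, and the breadth-first search shown is one common way to find a route with the fewest stops, not a description of how any particular graph database works internally.

```python
from collections import deque

# Hypothetical transit map: each station lists the stations one stop away.
TRANSIT_MAP = {
    "Airport": ["Central"],
    "Central": ["Airport", "Museum", "Harbor"],
    "Museum": ["Central", "University"],
    "Harbor": ["Central", "University"],
    "University": ["Museum", "Harbor", "Stadium"],
    "Stadium": ["University"],
}


def shortest_route(graph, start, goal):
    """Breadth-first search: return a route with the fewest stops, or None."""
    queue = deque([[start]])  # each queue entry is the route taken so far
    visited = {start}
    while queue:
        route = queue.popleft()
        station = route[-1]
        if station == goal:
            return route
        for neighbor in graph.get(station, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(route + [neighbor])
    return None  # the two stations are not connected


route = shortest_route(TRANSIT_MAP, "Airport", "Stadium")
print(" -> ".join(route))
# Airport -> Central -> Museum -> University -> Stadium
```

The search simply follows lines outward from the starting station, which is what a finger tracing a transit map does.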
In fact, you’ve likely come across a product or service powered by a graph database within the last few hours. Many everyday businesses have created new products and services and re-imagined existing ones by bringing data relationships to the fore. That’s because graph databases are the best way to model, store, and query both data and its relationships, which is crucial for next-generation applications that feature use cases such as real-time recommendations, graph-based search, and identity & access management.
For example, Walmart – which deals with almost 250 million customers weekly through its 11,000 stores across 27 countries and through its retail websites in 10 countries – wanted to understand the behavior and preferences of online buyers with enough speed and in enough depth to make real-time, personalized, ‘you may also like’ recommendations. By using a graph database, Walmart is able to connect masses of complex buyer and product data to gain insight into customer needs and product trends, very quickly.
Here’s how it works: The graph database stores and processes any kind of data by bringing relationships to the fore. A “graph” can be thought of as a whiteboard sketch: when you draw on a whiteboard with circles and lines, sketching out data, what you are drawing is a graph. Graph databases store and process data within the structure you’ve drawn, providing significant performance and ease-of-use advantages, plus unparalleled ease in evolving the data model. No other type of database does this. Because they are designed to do so, graph databases are becoming an essential tool in discovering, capturing, and making sense of intricate relationships and interdependencies.
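The whiteboard picture can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the customers, the products, the `BOUGHT` relationship type, and the scoring rule are all invented, and a real graph database stores and traverses relationships far more efficiently than these in-memory lists do. The point is only to show circles as nodes, lines as relationships, and a “you may also like” suggestion as a short walk along those lines.

```python
from collections import Counter

# Nodes are the circles on the whiteboard: an id plus a label and a name.
nodes = {
    "alice": {"label": "Customer", "name": "Alice"},
    "bob": {"label": "Customer", "name": "Bob"},
    "carol": {"label": "Customer", "name": "Carol"},
    "tent": {"label": "Product", "name": "Tent"},
    "stove": {"label": "Product", "name": "Camp stove"},
    "lantern": {"label": "Product", "name": "Lantern"},
}

# Relationships are the lines: (start node, relationship type, end node).
relationships = [
    ("alice", "BOUGHT", "tent"),
    ("bob", "BOUGHT", "tent"),
    ("bob", "BOUGHT", "stove"),
    ("carol", "BOUGHT", "tent"),
    ("carol", "BOUGHT", "stove"),
    ("carol", "BOUGHT", "lantern"),
]


def bought_by(customer):
    """Products reached by following BOUGHT lines out of a customer."""
    return {end for start, kind, end in relationships
            if start == customer and kind == "BOUGHT"}


def buyers_of(product):
    """Customers reached by following BOUGHT lines back from a product."""
    return {start for start, kind, end in relationships
            if end == product and kind == "BOUGHT"}


def you_may_also_like(customer):
    """Walk customer -> product -> other customer -> other product.

    Each product is scored by how many such walks reach it.
    """
    owned = bought_by(customer)
    scores = Counter()
    for product in owned:
        for other in buyers_of(product) - {customer}:
            for candidate in bought_by(other) - owned:
                scores[candidate] += 1
    return [nodes[p]["name"] for p, _ in scores.most_common()]


print(you_may_also_like("alice"))
# ['Camp stove', 'Lantern']
```

Evolving the model in this sketch means adding a new kind of circle or line, such as a hypothetical `VIEWED` relationship, without restructuring what is already there.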
The Seven Bridges Puzzle
Graph theory, far from being a recent data-handling development, is actually nearly 300 years old and can be traced to Leonhard Euler, a Swiss mathematician. Euler was looking to solve an old riddle known as the “Seven Bridges of Königsberg.” Set on the Pregel River, the city of Königsberg included two large islands connected to each other and to the mainland by seven bridges. The challenge was to map a route through the city that would cross each bridge exactly once while ending at the starting point. Euler realized that by reducing the problem to its basics, eliminating all features except the land masses and the bridges connecting them, he could develop a mathematical structure that proved the riddle impossible.
Today’s graphs are based directly on Euler’s design: each land mass is now called a “node” (or “vertex”), and each bridge a “link” (also known as a “relationship” or an “edge”). One great thing about graph databases, however, is that their end users don’t need to know anything about graph theory to experience immediate practical benefits.
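Euler’s reasoning can be checked with a short Python sketch. The bridge layout below follows the historical puzzle, but the short names for the land masses are labels chosen for this example. The test relies on the standard result that, in a connected graph, a round trip using every link exactly once exists only if every node touches an even number of links.

```python
from collections import Counter

# Four land masses and seven bridges. The names are labels for this sketch.
BRIDGES = [
    ("island", "north_bank"), ("island", "north_bank"),
    ("island", "south_bank"), ("island", "south_bank"),
    ("island", "east_island"),
    ("north_bank", "east_island"),
    ("south_bank", "east_island"),
]


def degrees(bridges):
    """Count how many bridge ends touch each land mass."""
    counts = Counter()
    for a, b in bridges:
        counts[a] += 1
        counts[b] += 1
    return counts


def has_round_trip(bridges):
    """Assuming the graph is connected, a closed walk that uses every
    bridge exactly once exists only if every degree is even."""
    return all(d % 2 == 0 for d in degrees(bridges).values())


print(dict(degrees(BRIDGES)))
# {'island': 5, 'north_bank': 3, 'south_bank': 3, 'east_island': 3}
print(has_round_trip(BRIDGES))
# False
```

Every land mass touches an odd number of bridges, so any walker who arrives must eventually run out of unused bridges to leave by. With four odd nodes, even a one-way walk that ends somewhere other than the starting point is ruled out, since that would require at most two.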
Graphs are a vital part of our online lives, powering everything from social media sites – including Twitter and Facebook – to the retail recommendations on eBay. Online dating also owes much of its success to the way graphs can analyze even the most complex relationships, looking not only at location and personal details but also passions, hobbies, and attitudes, and relationships between all of those things, to identify potential matches.
Enterprise efforts in fraud detection, master data management, and network and IT operations are vastly improving, thanks to relationship-based insight rooted in graph database usage.
Interest in the graph will continue to grow. The real-time nature of a graph database makes it an excellent platform for unlocking business value from data relationships, something that simply can’t be done on traditional SQL or most NoSQL databases. The uses and applications for graph databases seem endless, and it’s exciting to consider what innovations they will continue to power as the world unlocks the value of data relationships.
About the author: Emil Eifrem is CEO of Neo Technology and co-founder of the Neo4j project. Before founding Neo, he was the CTO of Windh AB, where he headed the development of highly complex information architecture for enterprise content management systems. Committed to sustainable open source, he guides Neo along a balanced path between free availability and commercial reliability. Emil is a frequent conference speaker and author on NoSQL databases. His Twitter handle is @emileifrem.