December 15, 2015

Harnessing GPUs Delivers a Big Speedup for Graph Analytics

Looking for a quick way to boost the performance of your graph-based machine learning algorithms? If you run the algos atop the Blazegraph database, you will see your queries execute 200 to 300 times faster when you add the new GPU module unveiled today by SYSTAP, the company behind Blazegraph.

The new Blazegraph GPU offering will help to solve big queries in life sciences, financial detection, and cyber security that can bog down graph databases running atop a traditional server architecture. By tapping into the high memory bandwidth and parallelism of GPUs, SYSTAP says it can solve the “Graph Cache Thrash” problem that commonly afflicts those architectures.

Almost no changes to the existing graph database are required to use Blazegraph GPU, and customers can continue using their existing graph data models and query interfaces, such as RDF/SPARQL, Property Graph (TinkerPop), and others.

It’s a new form of “drop-in GPU acceleration,” says Brad Bebee, founder and CEO of SYSTAP, which is based in Washington D.C. “This is what I would consider a classic GPU acceleration technology,” Bebee tells Datanami. “You have something you’re familiar with and already using. You add a GPU and all of a sudden, you’re able to get 200x to 300x query performance for your graph pattern-matching queries.”

This will be particularly helpful to organizations in the life sciences field. Some of these groups are using big graphs (on the order of 100 billion edges) to gain understanding of how diseases interact with DNA and to follow protein pathways to find new uses for existing drugs. “Life sciences has some very big problems,” Bebee says. “They have a certain need for being able to process more quickly and effectively.”

SYSTAP started developing Blazegraph GPU two years ago as part of a DARPA research project with University of Utah’s Scientific Computing and Imaging (SCI) Institute. The idea was to take the same type of GPU architecture used on the country’s largest supercomputer, Titan at the Oak Ridge Leadership Computing Facility, and apply it to a general GPU architecture that is more accessible to the broader community.

“We think it’s a real opportunity to democratize, if you will, these kinds of high performance computing analytics,” Bebee says. “Previously these were only available to companies that have large GPU clusters or could afford to do GPU things. Now it’s much easier to get access to GPUs, or you can get them at a much lower cost. Commercially we’re excited about bringing this kind of approach that previously was only available to more niche markets to broader perspective.”

Source: Blazegraph

It’s a natural fit to use GPUs to accelerate processing in a graph, says Bebee. “Because graphs represent relationships, they have a property called non-locality, also known as data dependent parallelism, and that means you can’t necessarily know, based on the graph, how to lay it out so you can hit cache and memory effectively,” he says. “All the tricks and techniques that people use for more relational-oriented problems just don’t work effectively for graph.”
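To make the data-dependent access pattern concrete, here is a minimal sketch of a level-by-level breadth-first traversal over a graph stored in compressed sparse row (CSR) form. It is an illustration only, not Blazegraph's implementation; the function names `build_csr` and `bfs` are hypothetical. The point to notice is that the next memory location to read (`targets[i]`) is known only after the previous read completes, which is why a fixed layout cannot guarantee cache hits. Each vertex in a frontier can, in principle, be expanded independently, which is the kind of parallelism a GPU can exploit.

```python
def build_csr(num_vertices, edges):
    """Build CSR arrays (offsets, targets) for a directed edge list."""
    degree = [0] * num_vertices
    for src, _ in edges:
        degree[src] += 1

    # offsets[v]..offsets[v + 1] delimits the neighbors of vertex v.
    offsets = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        offsets[v + 1] = offsets[v] + degree[v]

    targets = [0] * len(edges)
    cursor = list(offsets[:-1])  # next free slot for each source vertex
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def bfs(offsets, targets, source):
    """Level-synchronous BFS. Returns (depth per vertex, edges traversed)."""
    depth = [-1] * (len(offsets) - 1)  # -1 marks an unvisited vertex
    depth[source] = 0
    frontier = [source]
    traversed = 0

    while frontier:
        next_frontier = []
        for v in frontier:
            for i in range(offsets[v], offsets[v + 1]):
                w = targets[i]  # data-dependent read: w is unknown until now
                traversed += 1
                if depth[w] == -1:
                    depth[w] = depth[v] + 1
                    next_frontier.append(w)
        frontier = next_frontier
    return depth, traversed
```

For example, calling `build_csr(5, [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)])` and then `bfs(offsets, targets, 0)` visits every vertex and traverses all five edges. On a real graph with billions of edges, the neighbor lists are scattered across memory, so the inner loop spends most of its time waiting on those reads.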

Tapping into the huge computational power resident in Nvidia Kepler K40 GPUs was a natural, Bebee says, because it allows him to exploit the superior bandwidth to main memory and effective parallelism of GPUs to solve big graph problems. That also eliminates the Graph Cache Thrash that occurs when a CPU is waiting to be served data from memory. The various L1, L2, and L3 caches on a chip serve data up to 5 times faster than RAM.

Bebee is bullish on the power of GPUs to deliver big performance boosts for graphs. He cited a SYSTAP study that showed GPUs are dramatically more cost efficient for processing graphs than Hadoop and Cray architectures based on a metric known as the Graph Edge Traversal Performance (GTEP), which is equivalent to one billion traversed edges per second. An Accumulo graph database implemented on Hadoop cost about $18 million per GTEP, while a Cray XMT-2 came in at $180,000 per GTEP.

By comparison, one GTEP can be squeezed out of a $16,000 cluster of Nvidia Kepler K40s. And next year, when Nvidia ships its next-gen Pascal GPU architecture, the price-per-GTEP will drop to $4,000, Bebee says.
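The arithmetic behind those comparisons is simple enough to sketch. The snippet below uses only the dollar figures quoted above; the dictionary labels and the helper names `cost_ratio` and `gteps` are hypothetical, chosen for illustration, and the Pascal figure is Bebee's projection rather than a measured result.

```python
# Cost per GTEP in US dollars, as quoted in this article.
COST_PER_GTEP = {
    "Accumulo on Hadoop": 18_000_000,
    "Cray XMT-2": 180_000,
    "Nvidia Kepler K40 cluster": 16_000,
    "Nvidia Pascal (projected)": 4_000,
}


def cost_ratio(baseline, candidate, costs=COST_PER_GTEP):
    """How many times more the baseline costs per GTEP than the candidate."""
    return costs[baseline] / costs[candidate]


def gteps(edges_traversed, seconds):
    """Throughput in GTEP: billions of edges traversed per second."""
    return edges_traversed / seconds / 1e9
```

By these figures, `cost_ratio("Accumulo on Hadoop", "Nvidia Kepler K40 cluster")` works out to 1,125, the Cray XMT-2 comes in at 11.25 times the K40 cluster's cost per GTEP, and the projected Pascal price would be a further fourfold drop.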

SYSTAP is arguably one of the first companies to utilize GPUs as graph analytic accelerators. The parallel programming wasn’t easy, and it can be tricky to ensure the communication is happening correctly. But Bebee thinks SYSTAP has a six- to 12-month advantage on the rest of the market.

The company announced a second product that also will ship next year. Blazegraph DASL (pronounced “Dazzle”) is a new domain-specific language designed to allow analytic experts to write algorithms for large-scale machine learning and other complex applications that efficiently run on GPUs, without the need for parallel programming expertise or low-level device optimization.

“The goal of Blazegraph DASL is to be as easy to use as Spark and Scala for predictive analytics, but as fast as GPUs and CUDA,” Bebee says. “The people who really can write good predictive analytics want to write in a way that’s concise, simple, and smart. The people who can get on FPGAs and GPU and supercomputers and write blazing-fast code that uses parallel programming and multi-GPU communication patterns–there’s no overlap. They’re not the same people.”

If you haven’t heard of SYSTAP or Blazegraph, you’re not alone. While the company has been developing the open source graph database for nearly 10 years, its graph database previously languished under the name “BigData,” which may be an apropos description for what it does, but is a lousy differentiator. SYSTAP wisely adopted the name Blazegraph a year ago.
