GraphIt Promises Big Speedup in Graph Processing
Developers writing graph applications that scale to billions of vertices and trillions of edges are forced to make trade-offs that impact application performance. But thanks to a new graph language emerging from MIT’s Computer Science & Artificial Intelligence Laboratory (CSAIL) called GraphIt, those trade-offs may not hurt as much in the future.
Graphs have become intrinsic components of our computing world. We may not be aware of them, but we’re interacting with graphs every time we search for something on Google, seek a new friend on Facebook, or hail a ride on Uber.
While graphs allow us to access connected data much more quickly than organizing data into rows and columns does, they introduce their own collection of technical challenges. Researchers have been attacking these challenges with a variety of new tools, including the GraphIt language developed at MIT.
According to a paper on GraphIt published in October, the new domain-specific language (DSL) takes a new approach to solving graph processing challenges by separating what is computed (the algorithm) from how it is computed (the schedule).
“Programmers specify the algorithm using an algorithm language, and performance optimizations are specified using a separate scheduling language,” write Yunming Zhang and his MIT CSAIL and Adobe Research colleagues in the paper, titled “GraphIt: A High-Performance Graph DSL.”
“The algorithm language simplifies expressing the algorithms, while exposing opportunities for optimizations,” the authors write. “We formulate graph optimizations, including edge traversal direction, data layout, parallelization, cache, NUMA, and kernel fusion optimizations, as tradeoffs among locality, parallelism, and work-efficiency.”
The scheduling language, meanwhile, lets programmers “easily search through this complicated tradeoff space by composing together a large set of edge traversal, vertex data layout, and program structure optimizations,” the authors continue. “The separation of algorithm and schedule also enables us to build an autotuner on top of GraphIt to automatically find high-performance schedules.”
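To make the algorithm/schedule split concrete, here is a small Python sketch of the idea, not GraphIt syntax: a single BFS frontier-expansion routine (the algorithm) whose edge traversal direction, push versus pull, is selected by a separate parameter playing the role of a schedule. The graph representation and function names are illustrative assumptions, not anything from the paper.

```python
# Illustrative sketch (NOT GraphIt code): the algorithm -- one BFS frontier
# expansion -- is written once, while the traversal direction (push vs. pull)
# is a separate "schedule" choice that changes performance, not results.

def bfs_step(graph, frontier, parent, direction="push"):
    """Expand a BFS frontier; `direction` acts as a schedule knob."""
    next_frontier = set()
    if direction == "push":
        # Push: frontier vertices write to their out-neighbors.
        for src in frontier:
            for dst in graph[src]:
                if parent[dst] is None:
                    parent[dst] = src
                    next_frontier.add(dst)
    else:
        # Pull: unvisited vertices scan for an in-neighbor on the frontier.
        for dst in graph:
            if parent[dst] is None:
                for src in graph:
                    if dst in graph[src] and src in frontier:
                        parent[dst] = src
                        next_frontier.add(dst)
                        break
    return next_frontier

# Same algorithm under two schedules: identical output, different work pattern.
graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
parent = {0: 0, 1: None, 2: None, 3: None}
frontier = bfs_step(graph, {0}, parent, direction="push")
```

In GraphIt proper, this kind of choice is made in the scheduling language rather than threaded through the algorithm code, which is what lets the autotuner swap schedules without touching the algorithm.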
The end result is a dramatic increase in the performance of graph applications, according to the researchers. To prove it, they ran GraphIt against six other shared-memory graph frameworks: Ligra, Green-Marl, GraphMat, Galois, Gemini, and Grazelle. According to the researchers, GraphIt outperformed the next fastest framework on 24 of the 32 experiments, by up to 4.8x. On the experiments where it was not the fastest, GraphIt was never more than 43% slower than the fastest framework.
In addition to the GraphIt DSL, the researchers introduce a graph iteration space, which they describe as “a model [that] simplifies the design of the compiler by representing different combinations of optimizations as multi-dimensional vectors.” By compiling an encoding of this graph iteration space along with the analysis of the algorithm language, the software is able to generate “efficient and valid implementations” for different types of optimizations, the researchers write.
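The multi-dimensional vector idea can be sketched in a few lines of Python: each schedule is a point in a space whose axes are optimization choices, and enumerating the axes enumerates the candidate schedules. The dimension names and values below are illustrative, loosely echoing the optimizations the paper lists, and are not GraphIt’s actual internal encoding.

```python
# Rough sketch of a "graph iteration space": every schedule is a vector of
# per-dimension choices. Dimension names/values here are assumptions for
# illustration, not GraphIt's real encoding.
from itertools import product

dimensions = {
    "direction":       ["SparsePush", "DensePull"],
    "parallelization": ["serial", "vertex-parallel", "edge-aware-parallel"],
    "data_layout":     ["array-of-structs", "struct-of-arrays"],
}

# Each combination is one point in the tradeoff space; a compiler or
# autotuner can enumerate and evaluate these points systematically.
schedule_space = list(product(*dimensions.values()))
print(len(schedule_space))  # 2 * 3 * 2 = 12 candidate schedules
```

Representing schedules as vectors is what makes an autotuner tractable: searching the space becomes a matter of iterating over (or sampling) points rather than rewriting code by hand for each combination.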
Zhang co-wrote the paper with MIT professors Saman Amarasinghe and Julian Shun, MIT undergraduate Mengjiao Yang, MIT CSAIL postdoc Riyadh Baghdadi, and Shoaib Kamil of Adobe Research.
The researchers are now working to extend the compiler to support more optimizations and hardware platforms, such as GPUs and distributed memory. More information, including a video of Zhang’s presentation at a recent Association for Computing Machinery (ACM) SIGPLAN conference, can be found at http://graphit-lang.org/.