Recent work on real-world graph analytics has sought to leverage the massive parallelism offered by GPU devices, but challenges remain due to the inherent irregularity of graph algorithms and the limited GPU-resident memory available for storing large graphs. We present GraphReduce, a highly efficient and scalable GPU-based framework that operates on graphs that exceed the device’s internal memory capacity. GraphReduce combines edge- and vertex-centric implementations of the Gather-Apply-Scatter programming model and operates on multiple asynchronous GPU streams, exploiting the high degree of parallelism in GPUs while moving graph data efficiently between the host and the device.
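As a rough illustration of the Gather-Apply-Scatter model named above, the following minimal CPU-only sketch runs PageRank over a graph whose edge list is split into fixed-size shards, loosely mimicking a graph that is processed piece by piece because it does not fit in device memory. It is not GraphReduce's API or implementation: the function name `gas_pagerank`, the `shard_size` parameter, and the choice of PageRank are all assumptions made for illustration, and the sketch assumes every vertex has at least one outgoing edge.

```python
def gas_pagerank(edges, num_vertices, shard_size=2, iterations=20, damping=0.85):
    """Hypothetical Gather-Apply-Scatter PageRank over sharded edges.

    edges: list of (src, dst) pairs; every vertex is assumed to have
    at least one outgoing edge (no dangling-vertex handling).
    """
    # Out-degree of each vertex, needed to split a vertex's rank across its out-edges.
    out_degree = [0] * num_vertices
    for src, _ in edges:
        out_degree[src] += 1

    # Split the edge list into shards; a real out-of-core system would
    # stream such pieces between host and device memory.
    shards = [edges[i:i + shard_size] for i in range(0, len(edges), shard_size)]

    rank = [1.0 / num_vertices] * num_vertices
    for _ in range(iterations):
        # Gather (edge-centric): each edge contributes its source's share to its destination.
        gathered = [0.0] * num_vertices
        for shard in shards:
            for src, dst in shard:
                gathered[dst] += rank[src] / out_degree[src]

        # Apply (vertex-centric): each vertex computes its new value from the gathered sum.
        new_rank = [(1.0 - damping) / num_vertices + damping * g for g in gathered]

        # Scatter: publish the new vertex values so the next gather reads them.
        # In this sketch that is just replacing the rank list.
        rank = new_rank
    return rank


if __name__ == "__main__":
    # A three-vertex directed cycle: 0 -> 1 -> 2 -> 0.
    print(gas_pagerank([(0, 1), (1, 2), (2, 0)], num_vertices=3))
```

The gather loop iterates over edges while the apply step iterates over vertices, which is one simple way to read the "edge- and vertex-centric" combination; the asynchronous GPU streams described in the abstract are not modeled here.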
Revised: January 21, 2016
Published: September 30, 2015
Citation
Sengupta D., K. Agarwal, S. Song, and K. Schwan. 2015. GraphReduce: Large-Scale Graph Analytics on Accelerator-Based HPC Systems. In IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW 2015), May 25-29, 2015, Hyderabad, India, 604-609. Piscataway, New Jersey: IEEE. PNNL-SA-111320. doi:10.1109/IPDPSW.2015.16