Distributed multi-GPU systems pose significant challenges
and opportunities for efficient execution of parallel applications.
Graph algorithms are generally characterized by
irregular memory accesses, low computation-to-communication
ratios, and load-balancing problems that are especially hard to
address on multi-GPU systems. Graph community detection is
an important problem in the emerging domain of graph analytics
with numerous applications. In this paper, we present our
ongoing work on a distributed-memory multi-GPU implementation
of graph community detection. Our work parallelizes the widely
used (albeit serial) Louvain method on distributed multi-GPU
platforms. Supported by an extensive set of experiments on a
multi-GPU enabled supercomputer (OLCF Summit) and a single
compute node (Nvidia DGX-2®), we demonstrate performance
competitive with an existing distributed-memory CPU-based implementation,
and up to 6.5× better results than Nvidia RAPIDS®
cuGraph. To the best of our knowledge, this work represents
the first effort for community detection on distributed multi-GPU
systems. Our approach and related findings can be extended
to numerous other iterative graph algorithms on multi-GPU
systems.
Published: March 16, 2022
Citation
Gawande N.A., S. Ghosh, M. Halappanavar, A. Tumeo, and A. Kalyanaraman. 2022. Towards Scaling Community Detection on Distributed-Memory Heterogeneous Systems. Parallel Computing 111. PNNL-SA-156736. doi:10.1016/j.parco.2022.102898