December 22, 2020
Conference Paper

TriC: Distributed-memory Triangle Counting by Exploiting the Graph Structure

Abstract

Graph analytics has emerged as an important tool in the analysis of large scale data from diverse application domains such as social networks, cyber security and bioinformatics. Counting the number of triangles in a graph is a fundamental kernel with several applications such as detecting the community structure of a graph or in identifying important vertices in a graph. The ubiquity of massive datasets is driving the need to scale graph analytics on parallel systems. However, numerous challenges exist in efficiently parallelizing graph algorithms, especially on distributed-memory systems. Irregular memory accesses and communication patterns, low computation to communication ratios, and the need for frequent synchronization are some of the leading challenges. In this paper, we present TriC, our distributed-memory implementation of triangle counting in graphs using the Message Passing Interface (MPI), as a submission to the 2020 GraphChallenge competition. Using a set of synthetic and real-world inputs from the challenge, we demonstrate a speedup of up to 90x relative to previous work on 32 processor-cores of a NERSC Cori node. We also provide details from distributed runs with up to8192 processes along with strong scaling results. The observations presented in this work provide an understanding of the system-level bottlenecks at scale that specifically impact sparse-irregular workloads and will therefore benefit other efforts to parallelize graph algorithms.

Revised: January 29, 2021 | Published: December 22, 2020

Citation

Halappanavar M., and S. Ghosh. 2020. TriC: Distributed-memory Triangle Counting by Exploiting the Graph Structure. In IEEE High Performance Extreme Computing Conference (HPEC 2020), September 22-24, 2020, Waltham, MA, 1-6. Piscataway, New Jersey:IEEE. PNNL-SA-154840. doi:10.1109/HPEC43674.2020.9286167