Traditional implementations of parallel graph operations on distributed memory platforms are written using Message Passing Interface (MPI) point to point communication primitives such as Send-Recv (blocking and nonblocking). Apart from this classical model, the MPI community has over the years added other communication models; however, their suitability for handling the irregular traffic workloads typical of graph operations remain comparatively less explored. Our aim in this paper is to study these relatively underutilized communication models of MPI for graph applications. More specifically, we evaluate MPI’s one-sided programming, or Remote Memory Access (RMA), and nearest neighborhood collectives using a process graph topology. There are features in these newer models that are intended to better map to irregular communication patterns, as exemplified in graph algorithms.
As a concrete application for our case study, we use distributed memory implementations of a weighted graph matching algorithm to investigate performances of MPI-3 RMA and neighborhood collective operations compared to nonblocking Send- Recv. A matching in a graph is a subset of edges such that no two matched edges are incident on the same vertex. A maximum weight matching is a matching of maximum weight computed as the sum of the weights of matched edges. Algorithms for optimal matching have quadratic complexity and are difficult to parallelize. More importantly, execution of graph matching is dominated by high volume of irregular memory accesses, making it an ideal candidate for studying the effects of various MPI communication models on graph applications at scale.
Our neighborhood collectives and RMA implementations yields up to 6× speedup over traditional optimized nonblocking Send-Recv implementations, on thousands of cores of the NERSC Cori supercomputer. We believe the lessons learned from this case study can be adopted to benefit a wider range of graph application codes.
Revised: November 1, 2019 |
Published: September 2, 2019
Citation
Ghosh S., M. Halappanavar, A. Kalyanaraman, M.H. Khan, and A. Gebremedhin. 2019.Exploring MPI Communication Models for Graph Applications Using Graph Matching as a Case Study. In IEEE 33rd International Parallel & Distributed Processing Symposium (IPDPS 2019), May 20-24, 2019, Rio de Janeiro, Brazil, 761-779. Los Alamitos, California:IEEE Computer Society.PNNL-SA-138882.doi:10.1109/IPDPS.2019.00085