March 1, 2009
Journal Article

Topology Agnostic Hot-Spot Avoidance with InfiniBand

Abstract

InfiniBand has become a popular interconnect owing to its advanced features and open standard, and large-scale InfiniBand clusters are increasingly common, as reflected in the TOP500 supercomputer rankings. However, even with popular topologies such as the constant bisection bandwidth Fat Tree, hot-spots may occur with InfiniBand because of inappropriate configuration of network paths, the presence of other jobs in the network, and the unavailability of adaptive routing. In this paper, we present a hot-spot avoidance layer (HSAL) for InfiniBand, which provides hot-spot avoidance using path bandwidth estimation and multi-pathing through the LMC (LID Mask Control) mechanism, without taking the network topology into account. We propose an adaptive striping policy with a batch-based striping and sorting approach for efficient utilization of disjoint network paths. Integration of HSAL with MPI, the de facto programming model for clusters, shows promising results with collective communication primitives and MPI applications.
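
The abstract does not give the details of the striping policy, so the following is only a minimal sketch of the general idea, not the HSAL implementation. It assumes that a port configured with an LMC value of n answers to 2^n consecutive LIDs, each of which can be routed along a different path, and that a message is split across those paths in proportion to a per-path bandwidth estimate. The function names `lids_for_port` and `stripe_sizes`, and the bandwidth figures used with them, are hypothetical.

```python
from typing import List


def lids_for_port(base_lid: int, lmc: int) -> List[int]:
    """Return the LIDs a port answers to for a given LMC value.

    Assumption: the port owns 2**lmc consecutive LIDs starting at base_lid.
    """
    if lmc < 0:
        raise ValueError("lmc must be non-negative")
    return [base_lid + i for i in range(2 ** lmc)]


def stripe_sizes(message_bytes: int, est_bandwidth: List[float]) -> List[int]:
    """Split a message across paths in proportion to estimated bandwidth.

    This is a plain proportional split (hypothetical), not the batch-based
    striping and sorting policy proposed in the paper.
    """
    total = sum(est_bandwidth)
    if not est_bandwidth or total <= 0:
        raise ValueError("need at least one path with positive bandwidth")
    # Integer share per path, rounded down.
    sizes = [int(message_bytes * bw / total) for bw in est_bandwidth]
    # Hand the bytes lost to rounding to the path with the highest estimate.
    fastest = max(range(len(est_bandwidth)), key=est_bandwidth.__getitem__)
    sizes[fastest] += message_bytes - sum(sizes)
    return sizes
```

For example, `stripe_sizes(1000, [3.0, 1.0])` assigns three quarters of the message to the first path. A path whose estimate drops, for instance because another job is congesting it, would receive a correspondingly smaller share on the next split.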

Revised: August 11, 2010 | Published: March 1, 2009

Citation

Vishnu A., M.J. Koop, A. Moody, A. Mamidala, S. Narravula, and D.K. Panda. 2009. Topology Agnostic Hot-Spot Avoidance with InfiniBand. Concurrency and Computation: Practice & Experience 21, no. 3:301-319. PNNL-SA-69491. doi:10.1002/cpe.1359