Increasing concern about the power consumption of data centers and computer laboratories, which in some cases matches or exceeds the resources required to power a small city, drives a need for a new approach to parallel performance diagnosis that integrates traditional application-oriented performance data with measurements of the physical runtime environment. We have developed infrastructure for combined evaluation of system, application, and machine room performance in the high-end environment. We motivate our approach with a case study of the performance, power, and cooling impact of the choice of physical location for a scientific application within the machine room. We present a new intensity metric for use in automated performance diagnosis tools and discuss the challenges encountered.
Revised: February 19, 2016
Published: December 14, 2011
Citation
Knapp R., K. Karavanic, S. Krishnamoorthy, and A. Marquez. 2011. Power- and Cooling-Aware Parallel Performance Diagnosis. In Parallel and Distributed Computing and Systems (PDCS 2011), December 14-16, 2011, Dallas, Texas, Paper No. 757-114. Anaheim, California: ACTA Press. PNNL-SA-84331. doi:10.2316/P.2011.757-114