October 26, 2020
Conference Paper

Rapid Memory Footprint Access Diagnostics

Abstract

Footprint and reuse distance measure temporal locality and therefore do not capture the significance of access patterns (spacial locality). A strided access pattern has the largest possible footprint but usually has the best performance. To highlight exposed memory latency, we separate footprint into strided (prefetchable) and irregular (non-prefetchable) access components and calculate the growth rate of each. To rapidly compute these footprint access diagnostics, we present two methods, whole-program and precise. Current footprint analyses can cause 200× or more slowdown with realistic inputs and are therefore impractical. Our whole-program method reduces the overhead to 10% by computing upper bounds, but still yields inter-procedural insight through a call path profile. Our precise method uses additional static analysis and profiling to refine the upper bounds for intra-procedural loop nests. We evaluate our approaches using benchmarks that vary access patterns (strided \vs unpredictable), sparsity (all words in a cache line \vs some), and reuse (varying and repeated accesses per element). Notably, for loop nests with unpredictable accesses, the precise method's accuracy is within 10% of ideal. The whole-program method has sufficient accuracy to diagnose bottlenecks.

Revised: December 14, 2020 | Published: October 26, 2020

Citation

Kilic O.O., N.R. Tallent, and R.D. Friese. 2020. Rapid Memory Footprint Access Diagnostics. In 2020 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2020), August 23-25, 2020, Boston, MA, 273-284. Piscataway, New Jersey:IEEE. PNNL-SA-151215. doi:10.1109/ISPASS48437.2020.00047