April 26, 2025
Conference Paper

Shifting Between Compute and Memory Bounds: A Compression-Enabled Roofline Model

Abstract

In the evolving landscape of high-performance computing, and especially with the end of Moore’s Law and Dennard scaling, the ability to shift between compute-bound and memory-bound states is critical for adapting to diverse system and domain-specific architectures. This capability is vital for optimizing performance across distinct hardware configurations, such as accelerators, memory hierarchies, and cache systems. Although ad hoc optimization techniques, such as compressed and approximate computation, have been used to improve the performance of compute- and data-intensive workloads in distinct hardware settings, there is little understanding of 1) the rationale behind the performance improvement; 2) the capabilities of different optimizations; and 3) which optimization to apply in response to specific computational and memory demands. This work proposes a compression-enabled roofline model that uses data compression techniques to balance and transform between computational and memory demands. The model enables applications to adjust to the specific strengths and limitations of the underlying hardware and system in order to optimize resource utilization. The effectiveness of this approach is demonstrated with matrix multiplication kernels on different input sizes, with various compression techniques turned on and off, including 1) low-precision floating point; 2) sparse matrix formulation; and 3) compressed arrays with ZFP. By reducing memory transfer volumes and cache misses and increasing data locality and computational intensity through compression, the roofline model can shift an application between compute and memory bounds to align more efficiently with system capabilities. This advancement not only improves overall performance but also maximizes adaptability in diverse computing environments.
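The shift between bounds described in the abstract can be illustrated with the classic roofline formula, in which attainable performance is the smaller of peak compute and memory bandwidth times arithmetic intensity. The sketch below is not the paper's model. It assumes a hypothetical machine (1000 GFLOP/s peak, 100 GB/s bandwidth), ideal data reuse for an n x n matrix multiplication (each of the three matrices moves through memory once), and a hypothetical `compression_ratio` that shrinks the bytes moved while ignoring any decompression cost.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, flops, bytes_moved):
    """Classic roofline: min(peak compute, bandwidth * arithmetic intensity)."""
    intensity = flops / bytes_moved  # FLOPs per byte
    return min(peak_gflops, bandwidth_gbs * intensity)


def classify(peak_gflops, bandwidth_gbs, flops, bytes_moved):
    """Compare arithmetic intensity against the ridge point of the roofline."""
    ridge = peak_gflops / bandwidth_gbs
    intensity = flops / bytes_moved
    return "compute-bound" if intensity >= ridge else "memory-bound"


def matmul_cost(n, bytes_per_element=8.0, compression_ratio=1.0):
    """FLOPs and bytes for an n x n matmul under idealized assumptions.

    Assumes each of A, B, and C crosses the memory interface exactly once
    and that compression shrinks that traffic by `compression_ratio`
    (a hypothetical parameter; decompression work is not counted).
    """
    flops = 2.0 * n ** 3
    bytes_moved = 3.0 * n * n * bytes_per_element / compression_ratio
    return flops, bytes_moved


# Hypothetical machine parameters, chosen only for illustration.
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = 100.0

plain = matmul_cost(64)
packed = matmul_cost(64, compression_ratio=4.0)

plain_bound = classify(PEAK_GFLOPS, BANDWIDTH_GBS, *plain)
packed_bound = classify(PEAK_GFLOPS, BANDWIDTH_GBS, *packed)
plain_perf = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, *plain)
packed_perf = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, *packed)

print(plain_bound, round(plain_perf, 1))
print(packed_bound, round(packed_perf, 1))
```

Under these assumptions, the uncompressed 64 x 64 kernel sits below the ridge point of 10 FLOPs per byte, and a 4x reduction in bytes moved raises its arithmetic intensity above it. A real analysis would also need to account for compression and decompression overhead, cache behavior, and any accuracy loss from lossy formats.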

Published: April 26, 2025

Citation

Naraparaju R., T. Zhao, Y. Hu, D. Zhao, L. Guo, and N.R. Tallent. 2024. Shifting Between Compute and Memory Bounds: A Compression-Enabled Roofline Model. In SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, November 17-22, 2024, Atlanta, GA, 309-316. Piscataway, New Jersey: IEEE. PNNL-SA-203969. doi:10.1109/SCW63240.2024.00047
