Studying CPU and memory utilization of applications on Fujitsu A64FX and Nvidia Grace Superchip

January 7, 2025

Conference Paper

Abstract

ARM-based manycore CPU architectures are well positioned to meet the rising memory throughput requirements of modern data-intensive scientific applications in High Performance Computing (HPC). The Fujitsu A64FX CPU platform is based on the ARMv8.2-A architecture and is the processor of the flagship Japanese supercomputer "Fugaku", which was previously ranked #1 in the world on the Top500 list. The Nvidia Grace superchip features 144 Neoverse V2 cores based on the ARMv9 architecture with 4x128b SVE2, providing exceptional computational power. The chip supports up to 480GB of memory, making it well suited to AI, machine learning, and scientific computing workloads. In this paper, we conduct a thorough performance exploration of a variety of parallel bandwidth-sensitive benchmarks and applications compiled with the native Fujitsu compiler on a Fugaku A64FX compute node and with the ARM (LLVM) Compiler on an Nvidia Grace superchip compute node, engaging all the computational cores per cluster using OpenMP multithreading (assuming the cores can drive the available bandwidth). Our ultimate goal is to study the resource utilization of scientific applications and benchmarks on A64FX and Grace superchip, considering graph application scenarios (GAP Benchmark Suite) and eleven application proxies from the Rodinia heterogeneous benchmark suite (covering domains such as Data Mining, Bioinformatics, Fluid Dynamics, and Pattern Recognition). Through exhaustive performance monitoring, we quantify the resource utilization of diverse OpenMP-based HPC applications on both the Fujitsu A64FX and the Nvidia Grace Superchip platforms.

Published: January 7, 2025

Citation

Kang Y., S. Ghosh, M. Kandemir, and A. Marquez. 2024. Studying CPU and memory utilization of applications on Fujitsu A64FX and Nvidia Grace Superchip. In Proceedings of the International Symposium on Memory Systems (MEMSYS 2024), September 30-October 3, 2024, Washington, D.C., 198-207. New York, New York: Association for Computing Machinery. PNNL-SA-203239. doi:10.1145/3695794.3695813

Research topics

High-Performance Computing
