September 3, 2019
Conference Paper

MAC: Memory Access Coalescer for 3D-Stacked Memory

Abstract

Emerging data-intensive applications, such as graph analytics and data mining, exhibit irregular memory access patterns. Research has shown that with these memory-bound applications, traditional cache-based processor architectures, which exploit locality and regular patterns to mitigate the memory-wall issue, are inefficient. Meanwhile, novel 3D-stacked memory devices, such as Hybrid Memory Cube (HMC) and High Bandwidth Memory (HBM), promise significant increases in bandwidth that appear extremely appealing for memory-bound applications. However, conventional memory interfaces designed for cache-based architectures and JEDEC DDR devices fit poorly with 3D-stacked memory, which leads to significant under-utilization of the promised high bandwidth. As a response to these issues, in this paper we propose MAC (Memory Access Coalescer), a coalescing unit for 3D-stacked memory. We discuss the design and implementation of MAC in the context of a custom-designed cache-less architecture targeted at data-intensive, irregular applications. Through a custom simulation infrastructure based on the RISC-V toolchain, we show that MAC achieves a coalescing efficiency of 52.85% on average. It improves the performance of the memory system by 60.73% on average for a large set of irregular workloads.

Revised: September 6, 2019 | Published: September 3, 2019

Citation

Wang X., A. Tumeo, J.D. Leidel, J. Li, and Y. Chen. 2019. MAC: Memory Access Coalescer for 3D-Stacked Memory. In Proceedings of the 48th International Conference on Parallel Processing (ICPP 2019), August 5-8, 2019, Kyoto, Japan, Article No. 2. New York, New York: ACM. PNNL-SA-144149. doi:10.1145/3337821.3337867