PIMS: A Lightweight Processing-in-Memory Accelerator for Stencil Computations

November 1, 2019

Conference Paper

Abstract

Stencil computation is a classic computational kernel present in many high-performance scientific applications, such as image processing and partial differential equation (PDE) solvers. A stencil computation sweeps over a multi-dimensional grid and repeatedly updates the value at each point using the values of its neighboring points. Stencil computations often employ large datasets that exceed cache capacity, leading to excessive accesses to the memory subsystem. As such, 3D stencil computations on large grid sizes are memory-bound. In this paper we present PIMS, an in-memory accelerator for stencil computations. PIMS, implemented in the logic layer of a 3D-stacked memory, exploits the high bandwidth provided by through-silicon vias to reduce redundant memory traffic. Our comprehensive evaluation using three different grid sizes with six categories of orders indicates that the proposed architecture reduces data movement by 48.25% on average and reduces bank conflicts by up to 65.55%.
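
To make the kernel described in the abstract concrete, the following is a minimal sketch of one sweep of a 3D stencil. It is an illustration only and does not represent the PIMS design: the 7-point (order-1) averaging stencil, the fixed boundary handling, and the function names `jacobi_sweep_3d` and `run` are all assumptions chosen for brevity.

```python
def jacobi_sweep_3d(grid):
    """One sweep of an assumed 7-point (order-1) stencil over a 3D grid.

    Each interior point becomes the average of itself and its six face
    neighbours. Boundary points are copied unchanged (an assumption).
    """
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    # Write into a fresh copy so every update reads only old values.
    out = [[row[:] for row in plane] for plane in grid]
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            for k in range(1, nz - 1):
                out[i][j][k] = (
                    grid[i][j][k]
                    + grid[i - 1][j][k] + grid[i + 1][j][k]
                    + grid[i][j - 1][k] + grid[i][j + 1][k]
                    + grid[i][j][k - 1] + grid[i][j][k + 1]
                ) / 7.0
    return out


def run(grid, steps):
    """Apply the sweep repeatedly, as in an iterative solver."""
    for _ in range(steps):
        grid = jacobi_sweep_3d(grid)
    return grid
```

In this sketch every grid value is read by up to seven different point updates per sweep. Once the grid no longer fits in cache, those repeated reads are served by the memory subsystem, which is the kind of redundant traffic the abstract refers to.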

Revised: December 18, 2019 | Published: November 1, 2019

Citation

Li J., W. Xi, A. Tumeo, B. Williams, J.D. Leidel, and Y. Chen. 2019. PIMS: A Lightweight Processing-in-Memory Accelerator for Stencil Computations. In The International Symposium on Memory Systems (MEMSYS 2019), September 30-October 3, 2019, Washington DC, 41-52. New York, New York: ACM. PNNL-SA-146976. doi:10.1145/3357526.3357550