June 7, 2015
Conference Paper

Locality-Driven Dynamic GPU Cache Bypassing

Abstract

This paper presents novel cache optimizations for massively parallel, throughput-oriented architectures like GPUs. Based on the reuse characteristics of GPU workloads, we propose a design that integrates an efficient locality-filtering capability into the decoupled tag store of the existing L1 D-cache through simple and cost-effective hardware extensions.

Revised: July 13, 2015 | Published: June 7, 2015

Citation

Li C., S. Song, H. Dai, A. Sidelnik, S. Hari, and H. Zhou. 2015. Locality-Driven Dynamic GPU Cache Bypassing. In Proceedings of the 29th ACM on International Conference on Supercomputing (ICS 2015), June 8-11, 2015, Newport Beach, California, 66-77. New York, New York: ACM. PNNL-SA-109271. doi:10.1145/2751205.2751237