April 27, 2022
Conference Paper

LC-MEMENTO: A Memory Model for Accelerated Architectures


With the advent of heterogeneous architectures, and in particular the ubiquity of multi-GPU systems, it is becoming increasingly important to manage device memory efficiently in order to reap the benefits of the additional core count. To date, this responsibility falls mainly on the programmer: device-to-host data communication (and vice versa), if not done properly, may incur costly memory transfer operations and synchronization. The problem may be compounded by the additional requirement to maintain system-wide memory consistency, which may involve expensive synchronization overhead. In this paper, we present the Location Consistency Memory Model for Enhanced Transfer Operations (LC-MEMENTO). This framework incorporates runtime techniques for multi-GPU memory management to support relaxed synchronization semantics and memory transfer operations automatically. Specifically, we implement a relaxed form of a memory consistency model based on Location Consistency (LC) in an Asynchronous Many-Task Runtime (ARTS) and demonstrate that this memory model enables additional optimization opportunities for three representative applications encompassing different computational patterns (scientific computation, graphs, and data streaming).

Published: April 27, 2022


Ranganath K., J.S. Firoz, J.D. Suetterlein, J.B. Manzano Franco, A. Marquez, M.V. Raugas, and D. Wong. 2022. LC-MEMENTO: A Memory Model for Accelerated Architectures. In The 34th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2021), October 13-14, 2021. Lecture Notes in Computer Science, edited by X. Li and S. Chandrasekaran, 13181, 67–82. PNNL-SA-166245. doi:10.1007/978-3-030-99372-6_5