The rise of accelerator-based architectures and reconfigurable computing has exposed the weakness of software toolchains that still maintain a static view of the hardware instead of relying on a symbiotic relationship between static tools (e.g., compilers) and dynamic tools (e.g., runtimes). Over the past decades, this need has given rise to adaptive runtimes with increasingly finer-grained computational tasks. These finer tasks help take advantage of the hardware by switching out when a long-latency operation is encountered (because of deeper memory hierarchies and new memory technologies that might target streaming instead of random access), thus trading idle time for unrelated work. Examples of these fine-grained task runtimes are Asynchronous Many Task (AMT) runtimes, in which highly efficient computational graphs run on a variety of hardware. Because these runtimes are inherently latency tolerant, latency-sensitive applications, such as graph analytics and Big Data, can use them effectively. This paper presents an example of how the careful design of an AMT runtime can exploit the hardware substrate when faced with high-latency applications such as those in the Big Data domain. Moreover, we aim to show how the introspection and adaptive capabilities of these runtimes let them respond to the changing requirements of application workloads. We use the Performance Open Community Runtime (P-OCR) as our vehicle to demonstrate the concepts presented here.
Published: August 1, 2021
Citation
Suetterlein J.D., J.B. Manzano Franco, A. Marquez, and G.R. Gao. 2020. On the Marriage of Asynchronous Many Task Runtimes and Big Data: A Glance. In Proceedings of the 27th International Conference on High Performance Computing, Data, and Analytics (HiPC 2020), December 16-19, 2020, Pune, India, 233-242. Piscataway, New Jersey: IEEE. PNNL-SA-157240. doi:10.1109/HiPC50609.2020.00037