A Synthesis Methodology for Intelligent Memory Interfaces in Accelerator Systems

September 4, 2025

Conference Paper


Abstract

Domain-specific systems deploy custom hardware accelerators to improve the performance of a specific set of applications over general-purpose processing systems. These accelerators are generated using high-level synthesis (HLS) tools, which enable comprehensive design space exploration to optimize the compute performance of the generated accelerators. However, HLS tools often ignore the challenges of integrating the accelerators into a system-on-chip, particularly how the accelerators access memory. Our work introduces a buffering system design that improves accelerators' memory accesses by intelligently employing burst transactions to prefetch useful data from external memory into on-chip local buffers. Our design is dynamic, parametric, and transparent to the accelerators generated by HLS tools. We derive the buffering system parameters from compiler-based analysis passes and memory channel latency constraints. For a set of PolyBench kernels, the proposed buffering system improves performance by 8.8x on average while lowering memory channel utilization by 53.2% on average.
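As a rough illustration of why burst prefetching into a local buffer can reduce traffic on the memory channel, the toy model below counts external-memory transactions for a stream of word addresses. It is only a sketch under simplifying assumptions and is not the buffering system described in the paper: the function name `count_transactions`, the single aligned buffer, and the fixed `burst_len` parameter are all hypothetical choices made for this example.

```python
def count_transactions(addresses, burst_len):
    """Count external-memory transactions for a stream of word addresses.

    burst_len == 1 models an accelerator that issues one transaction
    per access (no local buffer). burst_len > 1 models a single local
    buffer that is refilled with one aligned burst of `burst_len`
    consecutive words whenever an access misses it.
    """
    transactions = 0
    buf_start = None  # first address currently held in the local buffer
    for addr in addresses:
        hit = buf_start is not None and buf_start <= addr < buf_start + burst_len
        if burst_len == 1 or not hit:
            transactions += 1
            buf_start = addr - (addr % burst_len)  # start of the aligned burst
    return transactions


sequential = list(range(64))           # unit-stride access pattern
strided = list(range(0, 256, 16))      # stride-16 access pattern

print(count_transactions(sequential, burst_len=1))   # one transaction per access
print(count_transactions(sequential, burst_len=16))  # one burst per 16 accesses
print(count_transactions(strided, burst_len=16))     # every access misses the buffer
```

In this model the unit-stride stream needs far fewer transactions once bursts are used, while the strided stream gains nothing because each burst fetches words that are never read. That contrast is one way to see why the abstract stresses prefetching useful data and deriving buffer parameters from compiler analysis of the access pattern.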

Published: September 4, 2025

Citation

Limaye A.M., N. Bohm Agostini, C. Barone, V.G. Castellana, M. Fiorito, F. Ferrandi, A. Marquez, et al. 2025. A Synthesis Methodology for Intelligent Memory Interfaces in Accelerator Systems. In Proceedings of the 30th Asia and South Pacific Design Automation Conference (ASP-DAC 2025), January 20-23, 2025, Tokyo, Japan, 1016-1022. New York, New York: Association for Computing Machinery. PNNL-SA-181728. doi:10.1145/3658617.3697553

Research topics

High-Performance Computing
