September 4, 2025
Conference Paper
A Synthesis Methodology for Intelligent Memory Interfaces in Accelerator Systems
Abstract
Domain-specific systems improve the performance of a specific set of applications compared to general-purpose processing systems by deploying custom hardware accelerators. These hardware accelerators are generated using high-level synthesis (HLS) tools, which enable comprehensive design space exploration to optimize the compute performance of the generated accelerators. However, HLS tools often ignore the challenges of integrating the accelerators into a system-on-chip, particularly how the accelerators access memory. Our work introduces a buffering system design that improves accelerators' memory accesses by intelligently employing burst transactions to prefetch useful data from external memory into on-chip local buffers. Our design is dynamic, parametric, and transparent to the accelerators generated by HLS tools. We derive the buffering system parameters using compiler-based analysis passes and memory channel latency constraints. For a set of PolyBench kernels, the proposed buffering system design yields an average performance improvement of 8.8x while lowering memory channel utilization by 53.2% on average.
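As a rough illustration of why burst prefetching into a local buffer can reduce both access time and memory channel traffic, the following toy cost model compares single-word accesses with burst transfers. It is only a sketch under stated assumptions and is not the paper's design: the function `simulate`, its parameters, and the cycle costs (a fixed channel latency per transaction plus one cycle per transferred word, and one cycle per buffer hit) are all hypothetical, and the numbers it produces are unrelated to the reported results.

```python
def simulate(addresses, burst_len, latency):
    """Toy model of a single-entry local buffer filled by burst transactions.

    Assumptions (hypothetical, for illustration only):
      - a miss issues one burst that fetches `burst_len` aligned words and
        costs `latency + burst_len` cycles;
      - a hit in the local buffer costs 1 cycle;
      - the buffer holds exactly one burst at a time.
    Returns (number of memory transactions, total cycles).
    """
    buffered_base = None  # base address of the burst currently buffered
    transactions = 0
    cycles = 0
    for addr in addresses:
        base = addr - (addr % burst_len)  # align to the burst boundary
        if base != buffered_base:
            # Miss: fetch the whole aligned burst in one transaction.
            transactions += 1
            cycles += latency + burst_len
            buffered_base = base
        else:
            # Hit: the word is already in the on-chip buffer.
            cycles += 1
    return transactions, cycles


if __name__ == "__main__":
    stream = range(64)  # a sequential access pattern of 64 words
    print("single-word accesses:", simulate(stream, burst_len=1, latency=100))
    print("16-word bursts:      ", simulate(stream, burst_len=16, latency=100))
```

In this model, a sequential stream benefits because the fixed channel latency is paid once per burst rather than once per word. An irregular access pattern would benefit less, which is one reason the buffer parameters need to be chosen per accelerator.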