Modern sensor networks for the power grid, urban pollution, sea monitoring, and large scientific instruments, along with the many Internet-of-Things devices at the “edge,” stream enormous amounts of data, far more than can be stored locally. These devices therefore need to analyze, organize, and reduce the data quickly and efficiently. To do this, they run processing applications that apply complex sequences of computational tasks to the streamed data. Only specialized accelerators can deliver the required processing performance within the very limited power budget of these edge devices.
Researchers from the Advanced Computing, Mathematics and Data (ACMD) Division at Pacific Northwest National Laboratory (PNNL) created DynPaC, a coarse-grained, dynamic, and partially reconfigurable array for streaming applications. The research was published at the 2021 IEEE International Conference on Computer Design, where it received the Best Paper Award.
Because streaming applications must handle diverse inputs, the systems that run them need to be both flexible and high-performing.
Coarse-grained reconfigurable arrays (CGRAs) provide both while maintaining a high level of energy efficiency. CGRAs are used across many application domains, such as multimedia, high-performance computing, and machine learning. They are typically programmed in one of two ways: either the entire array is used for a single kernel at a time and then reconfigured for the next, or the array is partitioned so that resources are allocated to all tasks at once.
However, data-dependent streaming applications can contain a variety of tasks that take different amounts of time to process. If the CGRA is programmed using the first method, this can lead to latency issues and overall accelerator inefficiency. If the CGRA is programmed using the second method, some hardware resources may sit underutilized because some tasks finish sooner than others.
DynPaC addresses these issues by allowing dynamic partial reconfiguration of computing resources based on the execution status of each task. This can increase computing speed by up to 1.44 times.
“Instead of offloading and accelerating a single task, DynPaC unleashes the full potential of novel CGRAs to allow the acceleration at the application level,” said Cheng Tan, lead author of the paper. “It dynamically reconfigures the CGRA fabric based on the execution status of each task. This rebalances the streaming application pipeline and improves the overall throughput.”
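The rebalancing idea can be illustrated with a toy model. The Python sketch below is not DynPaC's implementation; it assumes a hypothetical array of identical tiles, a pipeline whose throughput is limited by its slowest stage, task time that scales inversely with the number of tiles assigned, and a fixed reconfiguration cost. The function names (`balance_tiles`, `run_static`, `run_dynamic`) and all numbers are made up for illustration and are unrelated to the results reported in the paper.

```python
def balance_tiles(work, total_tiles):
    """Give every task one tile, then hand each spare tile to the current bottleneck.

    `work` holds the (assumed) per-item cost of each task; a task's time per
    item is modeled as work / tiles.
    """
    tiles = [1] * len(work)
    for _ in range(total_tiles - len(work)):
        slowest = max(range(len(work)), key=lambda i: work[i] / tiles[i])
        tiles[slowest] += 1
    return tiles


def stage_time(work, tiles):
    """Time per item for a pipeline: bounded by its slowest stage."""
    return max(w / t for w, t in zip(work, tiles))


def run_static(phases, total_tiles):
    """Partition once (tuned for the first phase) and never change it."""
    tiles = balance_tiles(phases[0][1], total_tiles)
    return sum(items * stage_time(work, tiles) for items, work in phases)


def run_dynamic(phases, total_tiles, reconfig_cost):
    """Repartition whenever the workload shifts, paying a cost per change."""
    total = 0.0
    tiles = None
    for items, work in phases:
        new_tiles = balance_tiles(work, total_tiles)
        if tiles is not None and new_tiles != tiles:
            total += reconfig_cost  # assumed fixed cost of a partial reconfiguration
        tiles = new_tiles
        total += items * stage_time(work, tiles)
    return total


if __name__ == "__main__":
    # Each phase: (number of items, per-item work of the three pipeline tasks).
    # In the second phase the input data shifts most of the work onto task 0.
    phases = [(100, [4, 4, 8]), (100, [12, 2, 2])]
    static = run_static(phases, total_tiles=16)
    dynamic = run_dynamic(phases, total_tiles=16, reconfig_cost=5.0)
    print(f"static partition:  {static} time units")
    print(f"dynamic partition: {dynamic} time units")
```

In this toy setup the static partition is well matched to the first phase but leaves most tiles idle in the second, while the dynamic variant pays a small reconfiguration cost to move tiles to the new bottleneck. Real CGRA tasks do not scale perfectly with tile count, so the gap shown here is only qualitative.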
By keeping accelerator resources busy as workloads shift, this approach could make streaming data processing on edge devices more efficient in the future.
Other authors of this research are Tong Geng, Chenhao Xie, Nicolas Bohm Agostini, Jiajia Li, Ang Li, Kevin Barker, and Antonino Tumeo from the ACMD at PNNL. This work was supported by the SO(DA)2 project under PNNL’s Data-Model Convergence (DMC) Laboratory Directed Research and Development Initiative.