Computational efficiency – performance relative to power or energy – is one of the most important concerns when designing RADAR processing systems. This paper analyzes power and performance trade-offs for a typical Space Time Adaptive Processing (STAP) application. We study STAP implementations in CUDA and OpenMP on two computationally efficient architectures, Intel Haswell Core i7-4770TE and NVIDIA Kayla with a GK208 GPU. We analyze the power and performance of STAP’s computationally intensive kernels across the two hardware testbeds. We also show the impact and trade-offs of GPU optimization techniques. We show that data parallelism can be exploited for efficient implementation on the Haswell CPU architecture. The GPU architecture is able to process large data sets without an increase in power requirements. The use of shared memory has a significant impact on the power requirement for the GPU. A balance between the use of shared memory and main memory access leads to improved performance in a typical STAP application.
Revised: September 24, 2015
Published: July 27, 2015
Citation
Gawande N.A., J.B. Manzano Franco, A. Tumeo, N.R. Tallent, D.J. Kerbyson, and A. Hoisie. 2015. Power and Performance Trade-offs for Space Time Adaptive Processing. In IEEE 20th International Conference on Application-specific Systems, Architectures and Processors (ASAP 2015), July 27-29, 2015, Toronto, Canada, 41-48. Piscataway, New Jersey: IEEE. PNNL-SA-110779. doi:10.1109/ASAP.2015.7245703