this paper, we have developed a new methodology that takes into consideration the access patterns of a single parallel actor (e.g., a thread) as well as the access patterns of “grouped” parallel actors that share a resource (e.g., a distributed Level 3 cache). We start with a hierarchical tile code for our target machine and apply a series of transformations at the tile level to improve data residence in a given level of the memory hierarchy. The contributions of this paper include (a) collaborative data restructuring for group reuse and (b) a low-overhead transformation technique that improves access patterns and brings closely connected data elements together. Preliminary results on a many-core architecture, the Tilera TileGX, show promising improvements over optimized OpenMP code (up to a 31% increase in GFLOPS) and over our own previous work on fine-grained runtimes (up to 16%) for selected kernels.
Revised: January 21, 2016
Published: August 24, 2015
Citation
Shrestha S., J.B. Manzano Franco, A. Marquez, S. Zuckerman, S. Song, and G.R. Gao. 2015. Gregarious Data Re-structuring in a Many Core Architecture. In IEEE 17th International Conference on High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conference on Embedded Software and Systems (ICESS), August 24-26, 2015, New York, 712-720. Piscataway, New Jersey: IEEE. PNNL-SA-110971. doi:10.1109/HPCC-CSS-ICESS.2015.291