Tiling is a critical loop transformation for generating high-performance code on modern architectures. Efficient generation of multilevel tiled code is essential to exploit several levels of parallelism and/or to maximize data reuse in deep memory hierarchies. Tiled loops with parameterized tile sizes (not compile time constants) facilitate runtime feedback and dynamic optimizations used in iterative compilation and automatic tuning. The existing parametric multilevel tiling approach has focused on transformation for perfectly nested loops, where all assignment statements are contained inside the innermost loop of a loop nest. Previous solutions to tiling for imperfect loop nests are limited to the case where tile sizes are fixed. In this paper, we present an approach to parameterized multilevel tiling for imperfectly nested loops. Our tiling algorithm generates loops that iterate over full rectangular tiles that are amenable for potential compiler optimizations such as register tiling. Experimental results using a number of computational benchmarks demonstrate the effectiveness of our tiling approach.
Revised: August 27, 2010 |
Published: May 18, 2009
Citation
Hartono A., M.M. Baskaran, C. Bastoul, A. Cohen, S. Krishnamoorthy, B. Norris, and J. Ramanujam, et al. 2009.Parametric Multi-Level Tiling of Imperfectly Nested Loops. In Proceedings of the 23rd International Conference on Supercomputing, 147-157. New York, New York:Association for Computing Machinery.PNNL-SA-65872.doi:10.1145/1542275.1542301