May 1, 2015
Conference Paper

Jagged Tiling for Intra-tile Parallelism and Fine-Grain Multithreading

Abstract

In this paper, we develop a novel methodology that takes multithreaded many-core designs into consideration to better utilize memory and processing resources and to improve memory residence in tileable applications. It takes advantage of polyhedral analysis and transformation in the form of PLUTO, combined with a highly optimized fine-grain tile runtime, to exploit parallelism at all levels. The main contributions of this paper are the introduction of multi-hierarchical tiling techniques that increase intra-tile parallelism, and a data-flow inspired runtime library that allows the expression of parallel tiles with an efficient synchronization registry. Our current implementation shows performance improvements of up to 32.25% on an Intel Xeon Phi board over instances produced by state-of-the-art compiler frameworks for selected stencil applications.

Revised: January 26, 2016 | Published: May 1, 2015

Citation

Shrestha S., J.B. Manzano Franco, A. Marquez, J.T. Feo, and G.R. Gao. 2015. Jagged Tiling for Intra-tile Parallelism and Fine-Grain Multithreading. In Languages and Compilers for Parallel Computing: 27th International Workshop (LCPC 2014), September 15-17, 2014, Hillsboro, Oregon. Lecture Notes in Computer Science, edited by J Brodman and P Tu, 8967, 161-175. New York, New York: Springer. PNNL-SA-104854. doi:10.1007/978-3-319-17473-0_11