The sparse triangular solve kernels, SpTRSV and SpTRSM, are important building blocks for a number of numerical linear algebra routines. Parallelizing SpTRSV and SpTRSM on today's many-core platforms, such as GPUs, is not an easy task since computing a component of the solution may depend on previously computed components, enforcing a degree of sequential processing. As a consequence, most existing work introduces a preprocessing stage to partition the components into a group of level-sets or colour-sets so that components within a set are independent and can be processed simultaneously during the subsequent solution stage. However, this class of methods requires a long preprocessing time as well as significant runtime synchronization overheads between the sets. To address this, we
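The level-set preprocessing that the abstract attributes to existing work can be illustrated with a minimal sequential sketch. It assumes a lower-triangular matrix stored in CSR form with a nonzero diagonal; the function names `level_sets` and `sptrsv_by_levels` and the small test matrix are hypothetical, chosen only for illustration. This is the conventional level-set scheme, not the synchronization-free algorithm of the cited paper.

```python
def level_sets(n, row_ptr, col_idx):
    """Group the rows of a lower-triangular CSR matrix into level-sets.

    Row i lands one level above the deepest row it depends on, so all
    rows inside one set are mutually independent.
    """
    level = [0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            if j < i:  # x[i] depends on the earlier component x[j]
                level[i] = max(level[i], level[j] + 1)
    sets = [[] for _ in range(max(level, default=-1) + 1)]
    for i, lv in enumerate(level):
        sets[lv].append(i)
    return sets


def sptrsv_by_levels(n, row_ptr, col_idx, vals, b):
    """Solve L x = b one level-set at a time (assumes a nonzero diagonal)."""
    x = [0.0] * n
    for rows in level_sets(n, row_ptr, col_idx):
        # Rows in this set are independent. A parallel version would handle
        # them simultaneously and synchronize before the next set, which is
        # the runtime overhead the abstract refers to.
        for i in rows:
            acc = b[i]
            diag = None
            for k in range(row_ptr[i], row_ptr[i + 1]):
                j = col_idx[k]
                if j == i:
                    diag = vals[k]
                else:
                    acc -= vals[k] * x[j]
            x[i] = acc / diag
    return x


# Hypothetical 4x4 example:
# L = [[2, 0, 0, 0],
#      [0, 1, 0, 0],
#      [1, 0, 4, 0],
#      [0, 3, 1, 5]]
row_ptr = [0, 1, 2, 4, 7]
col_idx = [0, 1, 0, 2, 1, 2, 3]
vals = [2.0, 1.0, 1.0, 4.0, 3.0, 1.0, 5.0]
b = [2.0, 2.0, 13.0, 29.0]

print(level_sets(4, row_ptr, col_idx))            # [[0, 1], [2], [3]]
print(sptrsv_by_levels(4, row_ptr, col_idx, vals, b))  # [1.0, 2.0, 3.0, 4.0]
```

In this example, rows 0 and 1 have no dependencies and form the first set, row 2 waits on row 0, and row 3 waits on rows 1 and 2, so three sets and two synchronization points are needed for four unknowns.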
Revised: August 14, 2019
Published: November 10, 2017
Citation
Liu W., A. Li, J.D. Hogg, I.S. Duff, and B. Vinter. 2017. Fast synchronization-free algorithms for parallel sparse triangular solves with multiple right-hand sides. Concurrency and Computation: Practice and Experience 29, no. 21: Article No. e4244. PNNL-SA-130501. doi:10.1002/cpe.4244