March 1, 2014
Journal Article

Parrallel Implementation of Fast Randomized Algorithms for Low Rank Matrix Decomposition

Abstract

We analyze the parallel performance of randomized interpolative decomposition by de- composing low rank complex-valued Gaussian random matrices larger than 100 GB. We chose a Cray XMT supercomputer as it provides an almost ideal PRAM model permitting quick investigation of parallel algorithms without obfuscation from hardware idiosyncrasies. We obtain that on non-square matrices performance scales almost linearly with runtime about 100 times faster on 128 processors. We also verify that numerically discovered error bounds still hold on matrices two orders of magnitude larger than those previously tested.

Revised: September 1, 2015 | Published: March 1, 2014

Citation

Lucas A.J., M. Stalizer, and J.T. Feo. 2014. Parrallel Implementation of Fast Randomized Algorithms for Low Rank Matrix Decomposition. Parallel Processing Letters 24, no. 1:Article No. 1450004. PNNL-SA-91896. doi:10.1142/S0129626414500042