March 15, 2013
Journal Article

Sequential Projection Pursuit Principal Component Analysis – Dealing with Missing Data Associated with New -Omics Technologies

Abstract

We present a new version of sequential projection pursuit Principal Component Analysis (sppPCA) that can perform PCA on large multivariate datasets containing non-random missing values. We demonstrate that sppPCA generates more robust and informative low-dimensional representations of the data than imputation-based approaches and improves downstream statistical analyses, such as clustering or classification. A Java program to run sppPCA is freely available at https://www.biopilot.org/docs/Software/sppPCA.

Revised: August 16, 2018 | Published: March 15, 2013

Citation

Webb-Robertson B.M., M.M. Matzke, T.O. Metz, J.E. McDermott, J. Walker, K.D. Rodland, J.G. Pounds, et al. 2013. Sequential Projection Pursuit Principal Component Analysis – Dealing with Missing Data Associated with New -Omics Technologies. BioTechniques 54, no. 3:165-168. PNNL-SA-87092.