June 17, 2014
Journal Article

Detecting differential protein expression in large-scale population proteomics

Abstract

Mass spectrometry-based high-throughput quantitative proteomics shows great potential in clinical biomarker studies, identifying and quantifying thousands of proteins in biological samples. However, methods are needed that appropriately handle challenges unique to mass spectrometry data so that as many biomarker proteins as possible can be detected. One issue is that different mass spectrometry experiments generate quite different total numbers of quantified peptides, which can result in more missing peptide abundances in experiments with fewer quantified peptides. Another issue is that peptide quantification is sometimes absent, especially for less abundant peptides, so these missing values themselves carry information about peptide abundance. Here, we propose Significance Analysis for Large-scale Proteomics Studies (SALPS), which handles missing peptide intensity values caused by the two mechanisms mentioned above. Our model performs robustly on both simulated data and proteomics data from a large clinical study. Because variation in patients’ sample quality and in instrument performance is unavoidable in clinical studies performed over several years, we believe that our approach will be useful for analyzing large-scale clinical proteomics data.

Revised: September 30, 2014 | Published: June 17, 2014

Citation

Ryu S., W. Qian, D.G. Camp, R.D. Smith, R.G. Tompkins, R.W. Davis, and W. Xiao. 2014. Detecting differential protein expression in large-scale population proteomics. Bioinformatics 30, no. 19:2741-2746. PNNL-SA-100193. doi:10.1093/bioinformatics/btu341