Motivation: Quantitative mass spectrometry-based proteomics requires protein-level estimates and associated confidence measures. Challenges include the presence of low-quality or incorrectly identified peptides and informative missingness. Furthermore, models are required for rolling peptide-level information up to the protein level.
Results: We present a statistical model that carefully accounts for informative missingness in peak intensities and allows unbiased, model-based, protein-level estimation and inference. The model is applicable to both label-based and label-free quantitation experiments. We also provide automated, model-based algorithms for filtering of proteins and peptides as well as imputation of missing values. Two LC/MS datasets are used to illustrate the methods. In simulation studies, our methods are shown to achieve substantially more discoveries than standard alternatives.
Availability: The software has been made available in the open-source proteomics platform DAnTE (http://omics.pnl.gov/software/).
Contact: adabney@stat.tamu.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
Revised: September 20, 2010
Published: August 15, 2009
Citation
Karpievitch Y., J.R. Stanley, T. Taverner, J. Huang, J.N. Adkins, C. Ansong, and F. Heffron, et al. 2009. A Statistical Framework for Protein Quantitation in Bottom-Up MS-Based Proteomics. Bioinformatics 25, no. 16:2028-2034. PNNL-SA-70100.