This report presents the results from the 2016 ABRF Proteome Informatics Research Group (iPRG) study on proteoform inference and FDR estimation from bottom-up proteomics data. For this study, three replicate Q Exactive Orbitrap LC-MS/MS datasets were generated from each of four E. coli samples spiked with different equimolar mixtures of small recombinant proteins selected to mimic pairs of homologous proteins. Participants were given raw data and a sequence file, and asked to identify the proteins and provide estimates of the false discovery rate at the proteoform level. As part of this study, we tested a new submission system with a format validator running on a virtual private server (VPS) and allowed methods to be provided as executable R Markdown or IPython Notebooks. The task was perceived as difficult, and only eight unique submissions were received, though those who participated did well, with no single method performing best on all samples. However, none of the submissions included a complete Markdown or Notebook, even though examples were provided. Future iPRG studies need to be more successful in promoting and encouraging participation. The VPS and submission validator easily scale to much larger numbers of participants in these types of studies. The unique “ground-truth” dataset for proteoform identification generated for this study is now available to the research community, as are the server-side scripts for validating and managing submissions.
Revised: January 11, 2019
Published: July 1, 2018
Citation
Lee J., H. Choi, C. Colangelo, D. Davis, M. Hoopmann, L. Kall, H. Lam, et al. 2018. ABRF Proteome Informatics Research Group (iPRG) 2016 Study: Inferring Proteoforms from Bottom-up Proteomics Data. Journal of Biomolecular Techniques: JBT 29, no. 2: 39–45. PNNL-SA-133941. doi:10.7171/jbt.18-2902-003