September 1, 2020
Journal Article

Flying Blind, or Just Flying Under the Radar? The Underappreciated Power of De Novo Methods of Mass Spectrometric Peptide Identification

Abstract

Mass spectrometry-based proteomics is a popular and powerful method for precise and highly multiplexed protein identification. The most common method of analyzing untargeted proteomics data is database searching, in which the database is a collection of protein sequences from the target organism, derived from genome sequencing. Experimental peptide tandem mass spectra are compared to simplified models of theoretical spectra calculated from the translated genomic sequences. However, in several interesting application areas, such as forensics, archaeology, venomics, and others, a genome sequence may not be available, or the correct genome sequence to use may not be known. In these cases, de novo peptide identification can play an important role. De novo peptide identification infers the peptide sequence directly from the tandem mass spectrum without reference to a sequence database, usually using graph-based or machine learning algorithms. In this review, we provide a basic overview of de novo peptide identification methods and applications, briefly covering de novo algorithms and tools and focusing in more depth on recent applications in venomics, metaproteomics, forensics, and the characterization of antibody drugs.

Revised: September 29, 2020 | Published: September 1, 2020

Citation

O'Bryon I., S.C. Jenson, and E.D. Merkley. 2020. Flying Blind, or Just Flying Under the Radar? The Underappreciated Power of De Novo Methods of Mass Spectrometric Peptide Identification. Protein Science 29, no. 9:1864-1878. PNNL-SA-153810. doi:10.1002/pro.3919