Science Directorate

Biological Sciences Division
Research Highlights

November 2008

Journal Cover Features De Novo Sequencing Approach for Identifying Proteins

Strategy combines unique sequence tags with spectrum analysis to characterize protein modifications.

The cover of the October 15 issue of Analytical Chemistry depicts procedures used to determine protein post-translational modifications from the de novo-sequenced UStags on a background of partial sequences.

Results: Scientists have developed a novel strategy that expands protein identifications beyond those predicted from genomic sequences. Their approach can be used to probe post-translational, or chemical, modifications of proteins, adding knowledge to studies of heart disease, cancer, neurodegenerative diseases, diabetes, and essentially any other area of biology. Their work was featured on the October 15, 2008, cover of Analytical Chemistry.

De novo (Latin for "from the beginning") sequencing is a spectrum analysis approach for mass spectrometry data to discover post-translational modifications in proteins. Its advantage is that it derives a peptide sequence without the help of a database of known protein sequences. The approach is still in its infancy and is not widely applied to proteomics because of its limited reliability.
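The core idea of reading a sequence directly from a spectrum, with no database lookup, can be sketched with a toy example. The snippet below is an illustration only, not the PNNL software: it assumes an idealized, noise-free ladder of singly charged b-ion peaks, and the function names `b_ion_ladder` and `read_ladder` are hypothetical. Each gap between neighboring peaks is matched to a standard monoisotopic residue mass.

```python
# Toy de novo read-out: map mass gaps between consecutive fragment
# peaks to amino acid residues. Idealized data, illustration only.

# Monoisotopic residue masses (Da) for a small subset of amino acids.
# Leucine and isoleucine are isobaric, so "L" stands for either here.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
}
PROTON = 1.00728  # mass of the charge-carrying proton (Da)


def b_ion_ladder(peptide):
    """Build an ideal singly charged b-ion ladder for a peptide."""
    total = PROTON
    ladder = []
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        ladder.append(total)
    return ladder


def read_ladder(peaks, tol=0.01):
    """Translate each gap between neighboring peaks into a residue.

    Gaps that match no residue within `tol` are reported as "?".
    """
    residues = []
    for low, high in zip(peaks, peaks[1:]):
        gap = high - low
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(gap - m) <= tol]
        residues.append(matches[0] if matches else "?")
    return "".join(residues)


peaks = b_ion_ladder("GASP")
partial_sequence = read_ladder(peaks)
```

Only the gaps are read, so the first residue of the ladder is not recovered in this sketch. Real spectra contain missing and spurious peaks, which is one reason de novo results need the kind of filtering the article describes.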

However, Pacific Northwest National Laboratory scientists developed a de novo approach for sequencing and discovering protein modifications that is based on identifying the proteome's unique sequence tags, called UStags. A UStag is a short portion of the amino acid sequence, of variable length, that is unambiguously and uniquely associated with a single protein. The combined de novo-UStag approach complements the previously reported UStag method (see Assigning Proteins ID Cards) by enabling the discovery of new protein modifications.
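The uniqueness property behind a UStag can be illustrated with a minimal sketch. This is not the published method: the two-protein "proteome", the tag length limits, and the function name `find_unique_tags` are assumptions made for the example. The code collects every substring within a length range and keeps those that occur in exactly one protein.

```python
from collections import defaultdict


def find_unique_tags(proteome, min_len=3, max_len=5):
    """Return {tag: protein_id} for substrings found in exactly one protein.

    `proteome` maps a protein identifier to its amino acid sequence.
    """
    owners = defaultdict(set)
    for protein_id, sequence in proteome.items():
        for length in range(min_len, max_len + 1):
            for start in range(len(sequence) - length + 1):
                owners[sequence[start:start + length]].add(protein_id)
    # Keep only tags that point to a single protein.
    return {tag: next(iter(ids)) for tag, ids in owners.items() if len(ids) == 1}


# Hypothetical toy proteome: the two sequences differ at one position.
toy_proteome = {
    "P1": "MKTAYIAK",
    "P2": "MKTAYVAK",
}
tags = find_unique_tags(toy_proteome)
```

In this toy case `"MKT"` occurs in both proteins and is rejected, while `"AYI"` and `"AYV"` each identify one protein. A real proteome has thousands of proteins, so an index built once and reused would be a more practical design than rescanning.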

Why it matters: Protein modifications may alter a protein's physical and chemical properties, folding, conformation distribution, stability, activity, and, consequently, function. Modifications often play a key role in protein signaling, controlling many important cellular processes. In turn, these alterations can lead to disease. The de novo-UStag approach developed at PNNL will enable scientists to identify proteins and their modifications with more certainty and lower false discovery rates. They can discover protein post-translational modifications, including complex multiple unknown or unexpected modifications on a single protein sequence, and discover sequence mutations and errors in genome-predicted database sequences.

Methods: The scientists obtained the de novo information from Fourier-transform tandem mass spectrometry data for peptides and polypeptides from yeast. The de novo sequences were selected based on filter levels designed to provide a limited yet high-quality subset of UStags. The DNA-predicted database protein sequences were then compared to the UStags, and the differences observed across or within the UStags were used to infer possible sequence modifications.
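The comparison step can be pictured with a simplified sketch. It is not the authors' implementation: it assumes the observed per-residue masses are already aligned to a database segment, and the names `infer_modifications` and `KNOWN_SHIFTS`, the tolerance, and the example data are hypothetical. Each observed mass is compared with the expected residue mass, and any difference is looked up in a short table of common modification mass shifts.

```python
# Monoisotopic residue masses (Da) for the residues used in the example.
RESIDUE_MASS = {"A": 71.03711, "S": 87.03203, "T": 101.04768, "K": 128.09496}

# Monoisotopic mass shifts (Da) of a few common modifications.
KNOWN_SHIFTS = {
    "phosphorylation": 79.96633,
    "oxidation": 15.99491,
    "acetylation": 42.01057,
}


def infer_modifications(db_segment, observed_masses, tol=0.01):
    """List positions where the observed mass differs from the database residue.

    Returns tuples of (position, residue, mass difference, label), where the
    label is a matching entry from KNOWN_SHIFTS or "unknown".
    """
    if len(db_segment) != len(observed_masses):
        raise ValueError("segment and observed masses must have equal length")
    findings = []
    for position, (aa, observed) in enumerate(zip(db_segment, observed_masses)):
        delta = observed - RESIDUE_MASS[aa]
        if abs(delta) <= tol:
            continue  # observed mass agrees with the database residue
        label = next(
            (name for name, shift in KNOWN_SHIFTS.items() if abs(delta - shift) <= tol),
            "unknown",
        )
        findings.append((position, aa, round(delta, 4), label))
    return findings


# Made-up observation: the serine carries an extra 79.96633 Da.
observed = [71.03711, 87.03203 + 79.96633, 101.04768, 128.09496]
findings = infer_modifications("ASTK", observed)
```

Here the only finding is at position 1 (serine), labeled as phosphorylation. A shift that matches no table entry is reported as "unknown", which mirrors the article's point that unexpected modifications, mutations, or database errors can surface this way.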

The most attractive aspect of de novo sequencing is its potential to discover protein sequences that differ from those predicted from the genome sequence as a result of amino acid modifications and sequence mutations (or database errors). This work demonstrated this potential by revealing both unexpected and complex multiple protein modifications. This development will be useful across a wide range of biological applications, and particularly in the extensive efforts in the large proteomics facility at PNNL, which is engaged in many applications currently supported by the U.S. Department of Energy and the National Institutes of Health.

Acknowledgments: Pacific Northwest National Laboratory is advancing science to achieve predictive understanding of multi-cellular biological systems. The research team at PNNL includes Yufeng Shen, Nikola Tolic, Kim Hixson, Samuel Purvine, Ljiljana Paša-Tolic, Weijun Qian, Josh Adkins, Ron Moore, Gordon Anderson, and Dick Smith. This research was supported by the Environmental Molecular Sciences Laboratory (EMSL) internal funds, the DOE Office of Biological and Environmental Research, and the NIH National Center for Research Resources. Work was performed in DOE's EMSL, a national scientific user facility at PNNL.

References: Shen Y, N Tolic, KK Hixson, SO Purvine, GA Anderson, and RD Smith. 2008. "De novo sequencing of unique sequence tags for discovery of post-translational modifications of proteins." Analytical Chemistry 80(20):7742-7754.

Shen Y, N Tolic, KK Hixson, SO Purvine, L Paša-Tolic, WJ Qian, JN Adkins, RJ Moore, and RD Smith. 2008. "Proteome-wide identification of proteins and their modifications with decreased ambiguities and improved false discovery rates using unique sequence tags." Analytical Chemistry 80(6):1871-1882.
