Science Directorate

Biological Sciences Division
Research Highlights

November 2015

Now Available: Seeds from the Tree of Life

Massive dataset dramatically improves access to information on environmentally important microbes

PNNL’s public release of more than 35,000 files with detailed information about important microbes can help scientists study the entire "tree of life." Image: Nathan Johnson

Results: For generations, scientists have been cataloging the tree of life—the branching, interconnected collection of species that live on Planet Earth. Bit by bit, they have developed a library of molecular characterizations of various species, a library that just grew substantially, thanks to research at Pacific Northwest National Laboratory. Scientists there have released more than 35,000 files related to a decade of research on microbial species. These files describe more than 100 microbial species, including many environmental strains and human pathogens. The scientists hope to promote wider use of this important resource.

"These are truly diverse species," said computational biologist Dr. Sam Payne, who led the team compiling and indexing the dataset. "This Biodiversity Library includes information on bacteria that are model microorganisms for studying photosynthesis, carbon and nitrogen assimilation, evolution of plant plastids, and adaptability to environmental stresses. You might say we are making available seeds from the tree of life so other scientists can study species in greater detail."

Why It Matters: Detailed molecular data can change the way we research and think about biological systems. Scientists look at the structure, function, and evolution of genes (genomics), analyze the proteins these genes produce and how they interact (proteomics), and systematically track the chemical traces of cellular processes (metabolomics). Computational biology research, which attempts to identify novel biological phenomena using these detailed data, depends on publicly available information to test new theories.

The dataset PNNL made available comes from hundreds of collaborative projects. Scientists packaged the data from these projects in the PNNL Biodiversity Library in multiple ways so that researchers from different disciplines can access the information. For most species, the biochemical data are broad and deep enough to support additional analyses.

Methods: The scientists sought to share their wealth of information by creating an environmentally diverse public repository. While a portion of the data has been freely available through the team's website for almost a decade, the sheer size of the complete library made broad distribution impossible. Recently, the ProteomeXchange repository system became able to accommodate significantly larger data volumes. The release of the massive dataset was highlighted in a recent issue of Scientific Data, a new type of scientific publication designed to promote an in-depth understanding of research datasets.

What's Next? With the Biodiversity Library, scientists can leverage identifications across an entire phylum, or perhaps the entire tree of life. PNNL will target new data acquisition to specific taxa that are currently underrepresented, hoping to double the phylogenetic diversity of the dataset.


Sponsors: The U.S. Department of Energy's (DOE's) Office of Science, Biological and Environmental Research funded this project via the Early Career Research Program and the Genomic Science Program.

User Facility: Data in the Biodiversity Library came from projects performed in the Environmental Molecular Sciences Laboratory (EMSL), a DOE national scientific user facility at PNNL.

Research Team: Sam Payne, Matthew Monroe, Christopher Overall, Gary Kiebel, Michael Degan, Bryson Gibbons, Grant Fujimoto, Samuel Purvine, Joshua Adkins, Mary Lipton, and Richard D. Smith, Pacific Northwest National Laboratory.

Reference: Payne S, M Monroe, C Overall, G Kiebel, M Degan, B Gibbons, G Fujimoto, S Purvine, J Adkins, M Lipton, and R Smith. 2015. "The Pacific Northwest National Laboratory library of bacterial and archaeal proteomic biodiversity." Scientific Data 2(150041). DOI: 10.1038/sdata.2015.41
