Optimizing Metaproteomics Database Construction: Lessons from a Study of the Vaginal Microbiome

February 15, 2024

Journal Article

Abstract

Metaproteomics, a method for untargeted, high-throughput identification of proteins in complex samples, provides functional information about microbial communities and can tie functions to specific taxa. Metaproteomics often generates less data than other omics techniques, but analytical workflows can be improved to increase usable data in metaproteomic outputs. Identification of peptides in metaproteomic analysis is performed by comparing mass spectra of sample peptides to a reference database of protein sequences. Although these protein databases are an integral part of metaproteomic analysis, few studies have explored how database composition impacts peptide identification. Here, we used cervicovaginal lavage samples from a study of bacterial vaginosis to compare the performance of databases built using six different strategies. We evaluated broad versus sample-matched databases, as well as databases populated with proteins translated from metagenomic sequencing of the same samples versus sequences from public repositories. Smaller sample-matched databases performed significantly better, driven by the statistical constraints on large databases. Additionally, large databases attributed up to 34% of significant bacterial hits to taxa absent from the sample, as determined orthogonally by 16S rRNA gene sequencing. We also tested a set of hybrid databases which included bacterial proteins from NCBI RefSeq and translated bacterial genes from the samples. These hybrid databases had the best overall performance, identifying 1,068 unique human and 1,418 unique bacterial proteins, ~30% more than a database populated with proteins from typical vaginal bacteria and fungi. Our findings can help guide the optimal identification of proteins while maintaining statistical power for reaching biological conclusions.
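The abstract's check on misattributed hits (the share of significant bacterial hits assigned to taxa that 16S rRNA gene sequencing did not detect) can be illustrated with a minimal sketch. The function name, the genus-level matching, and the toy data below are all assumptions made for illustration; they are not the authors' pipeline or data.

```python
from typing import Iterable, Set


def fraction_absent_taxa(hit_taxa: Iterable[str], taxa_seen_by_16s: Set[str]) -> float:
    """Return the fraction of hits whose taxon was NOT detected by 16S.

    hit_taxa: one taxon label per significant bacterial hit (hypothetical input).
    taxa_seen_by_16s: taxa detected in the same sample by 16S sequencing.
    """
    hits = list(hit_taxa)
    if not hits:
        # No hits means nothing can be misattributed.
        return 0.0
    absent = sum(1 for taxon in hits if taxon not in taxa_seen_by_16s)
    return absent / len(hits)


# Toy data, invented for illustration only.
hits = ["Lactobacillus", "Gardnerella", "Pseudomonas", "Lactobacillus"]
detected = {"Lactobacillus", "Gardnerella"}
print(fraction_absent_taxa(hits, detected))
```

In this toy case one of four hits maps to a taxon outside the 16S set. A real comparison would also need to settle the taxonomic rank at which hits and 16S calls are matched, which the abstract does not specify.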

Published: February 15, 2024

Citation

Lee E., S. Srinivasan, S.O. Purvine, T.L. Fiedler, O.P. Leiser, S. Proll, and S.S. Minot, et al. 2023. Optimizing Metaproteomics Database Construction: Lessons from a Study of the Vaginal Microbiome. mSystems 8, no. 4:e0067822. PNNL-SA-180853. doi:10.1128/msystems.00678-22

Research topics

Human Microbiome
