High-throughput (HTP) technologies offer the capability to evaluate the genome, proteome, and metabolome of organisms at a global scale. This opens up new opportunities to define complex signatures of disease that involve signals from multiple types of biomolecules. Integrating these data types, however, is difficult because the data are heterogeneous. We present a Bayesian approach to integration that uses posterior probabilities to assign class memberships to samples using individual and multiple data sources; these probabilities are based on lower-level likelihood functions derived from standard statistical learning algorithms. We demonstrate this approach on microbial infections of mice, where the bronchial alveolar lavage fluid was analyzed by two HTP proteomic technologies and one HTP metabolomic technology. Integrating the three datasets improves classification accuracy to ~89%, compared with ~83% for the best individual dataset. In addition, we present a new visualization tool called Visual Integration for Bayesian Evaluation (VIBE) that allows the user to observe classification accuracies at the class level and to evaluate classification accuracies on any subset of the available data types, based on the posterior probability models defined for the individual and integrated data.
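The integration idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the data sources are conditionally independent given the class, so that per-source likelihoods simply multiply, and the function name `integrate_posteriors` and all numeric values are hypothetical, chosen only for illustration.

```python
import math


def integrate_posteriors(likelihoods, prior):
    """Combine per-source class likelihoods into one posterior per class.

    likelihoods: one list per data source, each holding P(data | class k).
    prior:       list holding P(class k).
    Assumption (not from the paper): sources are conditionally independent
    given the class, and every likelihood and prior is strictly positive.
    """
    n_classes = len(prior)
    # Work in log space so products of many small numbers do not underflow.
    log_post = [math.log(p) for p in prior]
    for source in likelihoods:
        if len(source) != n_classes:
            raise ValueError("each source needs one likelihood per class")
        for k, lik in enumerate(source):
            log_post[k] += math.log(lik)
    # Normalize: subtract the maximum before exponentiating for stability.
    peak = max(log_post)
    unnormalized = [math.exp(v - peak) for v in log_post]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical likelihoods for two classes from three data sources
# (for example, two proteomic datasets and one metabolomic dataset).
sources = [[0.6, 0.4], [0.7, 0.3], [0.55, 0.45]]
prior = [0.5, 0.5]

all_three = integrate_posteriors(sources, prior)
first_only = integrate_posteriors(sources[:1], prior)
print(all_three, first_only)
```

Passing any subset of the source lists gives the posterior for that subset, which mirrors the kind of subset comparison the abstract attributes to VIBE.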
Revised: December 28, 2010
Published: March 1, 2009
Citation
Webb-Robertson B.M., L.A. McCue, N. Beagley, J.E. McDermott, D.S. Wunschel, S.M. Varnum, and J.Z. Hu, et al. 2009. A Bayesian Integration Model of High-Throughput Proteomics and Metabolomics Data for Improved Early Detection of Microbial Infections. In Pacific Symposium on Biocomputing, 14, 451-463. Singapore: World Scientific Publishing Co. PNNL-SA-61531.