Advanced Computing, Mathematics and Data
Linked Software Gives Better Picture of Organisms' Behavior
A core task in systems biology is to reconstruct an organism's regulatory and metabolic networks as a basis for understanding biological behavior and modifying organisms for useful applications. Scientists at Pacific Northwest National Laboratory have taken a significant step toward tackling that task by developing a pipeline of three linked software tools that analyze protein interaction and gene expression data. The results of that analysis are then used to reconstruct the underlying biological networks.
Currently, scientists have no way to directly determine the structure of a biological network, so they have to "infer" the structure. Network inference is a computational approach to reconstruct the structure of the network from secondary data. The PNNL-developed tools, which are available online, provide the ability to infer, and thus reconstruct, biological networks.
The combined software tools, SEBINI, BEPro3 and CABIN, help determine protein-protein interaction networks from affinity isolation experiments more accurately, in less time and with less effort than any other set of currently available software tools.
The first tool, the Software Environment for BIological Network Inference (SEBINI), enables analysis of high-throughput gene expression, protein expression or protein abundance data using a suite of state-of-the-art network inference algorithms. SEBINI also allows algorithm developers to compare and train network inference methods on artificial networks and simulated gene expression data. This means that SEBINI can be used by software developers who want to evaluate, refine or combine inference techniques, and by bioinformaticians to analyze experimental data.
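To illustrate what comparing inference methods on an artificial network can involve, here is a minimal Python sketch that scores an inferred set of interactions against a known "true" network using precision and recall. It is not taken from SEBINI; the function name, the example networks and the choice of metrics are assumptions made for illustration only.

```python
def score_inferred_network(inferred_edges, true_edges):
    """Return (precision, recall) for inferred undirected edges.

    Hypothetical helper for illustration; not part of SEBINI.
    """
    # Treat each edge as unordered, so ("A", "B") matches ("B", "A").
    inferred = {frozenset(edge) for edge in inferred_edges}
    truth = {frozenset(edge) for edge in true_edges}

    true_positives = len(inferred & truth)
    precision = true_positives / len(inferred) if inferred else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Made-up artificial network and a made-up result from some inference method.
true_network = [("A", "B"), ("B", "C"), ("C", "D")]
inferred_network = [("B", "A"), ("C", "D"), ("A", "D")]

precision, recall = score_inferred_network(inferred_network, true_network)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Because the true structure of an artificial network is known in advance, scores like these give algorithm developers a direct way to rank or tune competing methods before applying them to experimental data.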
Figure: Pictorial representation of the integrated pipeline.
The second component is the Bayesian Estimator of Probabilities of Protein-Protein Associations (BEPro3), which identifies protein-protein interactions from mass-spectrometry-based experiments much more efficiently than previous methods. BEPro3 provides estimates of the probability that an identified interaction between two proteins is real. The estimates are based on the data gathered across many different experiments. The set of identified interactions can be stored as a protein-protein interaction network that fits comfortably within the SEBINI framework.
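The idea of estimating the probability that an observed interaction is real can be sketched with Bayes' rule. The toy Python model below is not the BEPro3 model. It assumes that each experiment independently detects a true interaction with a fixed rate and reports a spurious one with a fixed false-positive rate, and the function name and all numeric values are hypothetical.

```python
def posterior_interaction_probability(
    times_observed,
    num_experiments,
    prior=0.01,                # assumed prior probability that a pair interacts
    detect_rate=0.6,           # assumed chance of observing a real interaction
    false_positive_rate=0.05,  # assumed chance of observing a spurious one
):
    """Posterior probability that a protein pair truly interacts.

    Toy independent-experiments model for illustration; not BEPro3's method.
    """
    misses = num_experiments - times_observed
    # Likelihood of the observations if the interaction is real.
    likelihood_real = detect_rate ** times_observed * (1 - detect_rate) ** misses
    # Likelihood of the same observations if it is not real.
    likelihood_spurious = (
        false_positive_rate ** times_observed * (1 - false_positive_rate) ** misses
    )
    numerator = prior * likelihood_real
    return numerator / (numerator + (1 - prior) * likelihood_spurious)


# A pair seen in all three experiments versus a pair seen in only one of three.
consistent = posterior_interaction_probability(3, 3)
sporadic = posterior_interaction_probability(1, 3)
print(consistent, sporadic)
```

Even in this simplified form, pooling evidence across experiments separates pairs that are observed consistently from pairs that appear only sporadically.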
The networks inferred by BEPro3 or one of the algorithms in SEBINI can be automatically passed on to a third tool, Collective Analysis of Biological Interaction Networks (CABIN), for further analysis, such as network validation or network expansion using public bioinformatics data. Interactions from public databases can also be fed into CABIN as an "evidence network" that can be compared with the networks generated by SEBINI. Using CABIN, researchers can critically evaluate the likelihood of each network interaction, as well as extend the network by combining it with other information, such as known protein-protein interactions or target genes identified for known transcription factors.
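As a rough illustration of comparing an inferred network with an evidence network, the Python sketch below sorts interactions into those supported by the evidence, those unsupported, and evidence interactions that could extend the inferred network. It does not reproduce CABIN's analysis; the function name, the protein labels and the simple set-overlap logic are assumptions.

```python
def compare_with_evidence(inferred_edges, evidence_edges):
    """Split interactions into supported, unsupported and expansion candidates.

    Hypothetical helper for illustration; not part of CABIN.
    """
    inferred = {frozenset(edge) for edge in inferred_edges}
    evidence = {frozenset(edge) for edge in evidence_edges}

    # Proteins that already appear in the inferred network.
    known_proteins = set().union(*inferred) if inferred else set()

    supported = inferred & evidence
    unsupported = inferred - evidence
    # Evidence interactions that touch a known protein but were not inferred.
    expansion = {edge for edge in evidence - inferred if edge & known_proteins}
    return supported, unsupported, expansion


# Made-up inferred network and made-up evidence from a public database.
inferred_network = [("P1", "P2"), ("P2", "P3")]
evidence_network = [("P2", "P1"), ("P3", "P4"), ("P8", "P9")]

supported, unsupported, expansion = compare_with_evidence(
    inferred_network, evidence_network
)
print(supported, unsupported, expansion)
```

Here the supported set corresponds to interactions a researcher might treat as validated, while the expansion set lists candidates for extending the network.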
A copy of SEBINI with both CABIN and BEPro3 installed can be obtained by contacting the webmaster on the SEBINI public demo site.
Acknowledgments: This work is advancing science to achieve predictive understanding of multi-cellular biological systems. Development of SEBINI and CABIN was supported by the Biomolecular Systems Laboratory Directed Research and Development Initiative at PNNL. Work on BEPro3 and the integrated pipeline was supported by the U.S. Department of Energy's Genomics: GTL program within the Office of Biological and Environmental Research. The data analysis pipeline team includes Ronald C. Taylor, Mudita Singhal, Don S. Daly, Anuj Shah, William R. Cannon, Amanda M. White, Deanna L. Auberry, Ken J. Auberry, Brian S. Hooker, Kristin D. Victry and H. Steven Wiley, all PNNL, and Greg Hurst, Hayes McDonald, Dale Pelletier and Denise Schmoyer, Oak Ridge National Laboratory.
Taylor RC, M Singhal, DS Daly, JM Gilmore, KO Domico, AM White, DL Auberry, KJ Auberry, BS Hooker, GB Hurst, JE McDermott, WH McDonald, DA Pelletier, DA Schmoyer, and WR Cannon. 2008. "An analysis pipeline for the inference of protein-protein interaction networks." International Journal of Data Mining and Bioinformatics (in press).
Sharp JL, KK Anderson, GB Hurst, DS Daly, DA Pelletier, WR Cannon, DL Auberry, DD Schmoyer, WH McDonald, AM White, BS Hooker, KD Victry, MV Buchanan, V Kery, and HS Wiley. 2007. "Statistically inferring protein-protein associations with affinity isolation LC-MS/MS assays." Journal of Proteome Research 6(9):3788-3795.
Taylor RC, A Shah, C Treatman, and M Blevens. 2006. "SEBINI: Software Environment for BIological Network Inference." Bioinformatics 22(21):2706-2708.
Singhal M and KO Domico. 2007. "CABIN: Collective Analysis of Biological Interaction Networks." Computational Biology and Chemistry 31(3):222-225.