Computational Biology and Bioinformatics
Supercomputers in the Environmental Molecular Sciences Laboratory’s Molecular Science Computing Facility are used for computational modeling of complex biological systems.
Advancements in biology in the 21st century will be dominated by research at the convergence of biological, physical, and information sciences. Researchers at Pacific Northwest National Laboratory (PNNL) are addressing the computational needs of modern biology by developing the infrastructure, databases, and software that are key to systems biology research. Our bioinformaticists are developing computational methods for biological modeling. We are devising new, high-throughput data management technologies to capture, store, access, and analyze vast amounts of genomics, proteomics, and metabolomics data.
PNNL's computational biology and bioinformatics researchers develop, efficiently implement, and apply computational tools, with a focus on the end user. Our collaborative approach directly involves scientists to help design intuitive tools that precisely target the unique needs of the biological science domain.
PNNL’s computational biology and bioinformatics capabilities enable all of our Laboratory Research and Development projects in systems biology. For example, cataloging and organizing gene regulatory families and defining their relationships to protein expression and protein complex formation require capabilities that organize data, permit pattern recognition, and allow predictive models to be formulated around hypothesis-driven principles. Our computational biology and bioinformatics research and development efforts are focused on three main areas: computational modeling, bioinformatics, and computational infrastructure.
At PNNL, computational biology facilitates the understanding of cell behavior by creating sophisticated mathematical- and computer-based models. Computational modeling provides validation as well as modeling-enriched analysis for experiments that cannot otherwise be run. Network modeling improves the understanding of how cells sense their environment and respond to environmental stimuli. Molecular modeling focuses on molecular dynamics simulation and biomolecular systems analysis.
Our complex models and simulations use a broad range of biological information, including high-throughput genomics and proteomics data. These advanced, mechanistic models allow scientists to create new hypotheses to analyze the ever-expanding mass of biological data.
Research in computational modeling is wide-ranging at PNNL and includes biomolecular modeling and simulation, kinetic modeling, and network analysis of cell signaling and metabolic pathways. Biomolecular systems currently being studied at PNNL include protein-protein and protein-DNA complexes, bacterial membranes, and membrane-mineral systems. For modeling these large and complicated systems, researchers use machines available in the Environmental Molecular Sciences Laboratory’s Molecular Science Computing Facility, such as the massively parallel processor.
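As a rough illustration of what kinetic modeling involves, the sketch below integrates mass-action rate equations for a single reversible binding reaction, P + L <-> PL, using a classical fourth-order Runge-Kutta step. It is not a PNNL tool. The reaction, the rate constants, and every function name are assumptions chosen only for illustration.

```python
# Illustrative sketch only: the reaction P + L <-> PL, the rate constants,
# and all function names are assumptions, not part of any PNNL software.

def binding_rates(state, k_on, k_off):
    """Mass-action time derivatives (dP/dt, dL/dt, dPL/dt) for P + L <-> PL."""
    p, l, pl = state
    flux = k_on * p * l - k_off * pl  # net rate of complex formation
    return (-flux, -flux, flux)


def rk4_step(state, dt, k_on, k_off):
    """Advance the concentrations by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = binding_rates(state, k_on, k_off)
    k2 = binding_rates(shifted(state, k1, dt / 2), k_on, k_off)
    k3 = binding_rates(shifted(state, k2, dt / 2), k_on, k_off)
    k4 = binding_rates(shifted(state, k3, dt), k_on, k_off)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state, k_on, k_off, dt, steps):
    """Return the list of states visited, starting with the initial state."""
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(state, dt, k_on, k_off)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    # Arbitrary units; with k_on = 1.0 and k_off = 0.5 the complex
    # concentration relaxes toward the root of k_on * (1 - x)**2 = k_off * x.
    final = simulate((1.0, 1.0, 0.0), k_on=1.0, k_off=0.5, dt=0.01, steps=2000)[-1]
    print(final)
```

Models of real signaling or metabolic pathways couple many such reactions and usually rely on a stiff ODE solver rather than a fixed-step integrator, but the structure is the same: a rate function plus a time-stepping scheme.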
Bioinformatics involves the design, development, and application of the computer systems and software that enable scientists to explore high-throughput data from gene expression microarray experiments, mass spectrometry-based peptide and protein identification experiments, and various quantitative measures of metabolic state and metabolites. PNNL bioinformatic scientists design and develop novel computer systems to store and integrate such data with prior knowledge of biological pathways and common regulatory mechanisms. The purpose of these systems is to yield as full a description of the cellular state as possible. For example, they help researchers discover the constituents and states of metabolic and signaling pathways and understand the connections and dynamic behavior in the genetic transcriptional and protein-protein interaction networks.
Bioinformatic scientists at PNNL develop diverse computational tools and techniques to facilitate systems biology research, including tools for high-performance computing, visualization, automated inference, and machine learning. Their work also includes:
- reconstructing the "wiring diagrams" of biological networks (via Bayesian network, dependency network, and other models that incorporate background information of varying confidence)
- creating simulation algorithms of network behavior
- developing algorithms that link high-throughput analysis results to biological databases for annotation and interpretation
- mining text from external public data sources.
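To make the idea of reconstructing a network "wiring diagram" concrete, the sketch below links two genes whenever their expression profiles are strongly correlated. This is a deliberately simple stand-in, not the Bayesian or dependency network models named above, and it does not incorporate background information. The gene names, expression values, and threshold are hypothetical.

```python
# Illustrative sketch only: a correlation-threshold network, which is a much
# simpler technique than the Bayesian and dependency network models in the
# text. Gene names, values, and the threshold are hypothetical.
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def infer_edges(profiles, threshold=0.9):
    """Return (gene_a, gene_b, r) for every pair with |r| >= threshold."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


if __name__ == "__main__":
    # Hypothetical expression levels measured across five conditions.
    profiles = {
        "geneA": [1, 2, 3, 4, 5],
        "geneB": [2, 4, 6, 8, 10],
        "geneC": [5, 4, 3, 2, 1],
        "geneD": [3, 1, 4, 1, 5],
    }
    for a, b, r in infer_edges(profiles):
        print(f"{a} -- {b}  r = {r:+.2f}")
```

Correlation alone cannot separate direct from indirect relationships or indicate direction, which is one reason probabilistic models that weigh prior knowledge are used in practice.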
The design, development, and implementation of computational infrastructures provide researchers with the tools necessary to support data acquisition, metadata tracking, data storage, data retrieval, and analysis capabilities within a structured framework. PNNL is building a computational infrastructure to support both the experimental and computational processes for high-throughput gene expression, proteomic, and metabolomic system research.
This work includes the development of laboratory information systems integrated with data management systems to provide a complete representation of the experimental data sets. Our research into the design and development of collaboratories and problem-solving environments provides resource discovery and access to a heterogeneous suite of data, modeling, simulation, and analysis tools from a common framework. We are also developing visualization software to facilitate the analysis of experimental data and computational results.