BioPilot: Data Intensive Computing for Complex Biological Systems
Dr. T.P. Straatsma
Associate Division Director
Through the BioPilot project, newly developed simulations of complex biological phenomena are quite promising. Pictured is a space filling model of the outer membrane of Pseudomonas aeruginosa showing the extent to which water molecules can penetrate.
Results: In a joint research project between the Pacific Northwest National Laboratory and Oak Ridge National Laboratory, scientists are building an integrated suite of flexible, high-performance computational capabilities to enable large-scale predictive modeling and simulation of complex biological systems associated with systems biology. The BioPilot project is developing the mathematics, algorithms, and efficient software implementations for computing architectures aimed at systems biology problems.
Researchers on the BioPilot project have designed novel algorithms that take advantage of globally addressable memory architectures, prototyped codes for these environments, and evaluated their performance. Their project focuses on the key data-intensive biological applications where advanced computing capabilities are in the highest demand, and explores the feasibility of these architectures for the large-scale production needs of the biology community. BioPilot represents an important step on the path from various data types to meaningful large-scale models and simulations of complex biological systems.
Why it matters: Systems biology is the systematic study of complex interactions in biological systems and how these interactions give rise to the function and behavior of those systems. Conventional computational approaches cannot easily address systems biology problems. Biological computing problems are typically data-intensive and require the use of massively parallel computing resources. New and innovative tools and effective ways to analyze the enormous amounts of data being generated are needed.
The work in the BioPilot project will have a broad impact on many data-intensive applications in systems biology, including energy production, carbon sequestration, environmental cleanup, and national security, all of which are important DOE missions.
Methods: Transforming biological research from a qualitative, descriptive science to a quantitative, predictive science requires the integration of modern high-throughput experimental technologies with computational data-intensive analysis and high-performance modeling and simulation. This transformation is critical for addressing the important challenges in energy and environmental security.
Predictive high-performance simulations of biological systems' dynamics require starting with data-driven model building from large-scale experimental or simulation data. The BioPilot project involves the development of computational capabilities for large-scale analysis of available data to obtain the elementary components of biological systems, tools for the abstraction of biological models from these components, and high-performance codes for predictive simulation of these models.
Analysis of simulation results then leads to predictions of biological systems' functions.
Ultimately, these predictions generate specific hypotheses that can be experimentally tested. Comparisons between predictions and experiments can then be used to refine models to reflect improved understanding of biological systems.
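The predict, compare, and refine cycle described above can be illustrated with a deliberately small sketch. It is not BioPilot code: the exponential growth model, the function names (simulate, mismatch, refine), and the synthetic "observations" are all assumptions chosen only to make the loop concrete.

```python
import math

def simulate(rate, times, x0=1.0):
    """Predict a population size at each time under exponential growth (toy model)."""
    return [x0 * math.exp(rate * t) for t in times]

def mismatch(predicted, observed):
    """Sum of squared differences between prediction and experiment."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

def refine(rate, times, observed, step=0.1, rounds=50):
    """Repeat predict -> compare -> refine; halve the step when no neighbor improves."""
    best = mismatch(simulate(rate, times), observed)
    for _ in range(rounds):
        improved = False
        for candidate in (rate - step, rate + step):
            err = mismatch(simulate(candidate, times), observed)
            if err < best:
                rate, best, improved = candidate, err, True
        if not improved:
            step /= 2
    return rate, best

times = [0.0, 1.0, 2.0, 3.0]
# Hypothetical stand-in for experimental measurements (generated with rate 0.5).
observed = simulate(0.5, times)
# Start from a deliberately poor initial model and let the loop refine it.
fitted_rate, error = refine(0.1, times, observed)
```

In this toy setting the refined rate approaches the value used to generate the synthetic data. Real systems biology models involve many parameters and far larger data sets, which is where the high-performance capabilities described in this article come in.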
What's next: Researchers in the BioPilot project are developing new approaches for dealing with biological data and are collaborating with researchers in a variety of bioscience projects that aim to more fully understand the carbon cycle, biofuel production, and bioremediation, and that support other DOE missions in energy and environmental research. The ultimate confirmation of the BioPilot strategy will be new insights, hypotheses, and discoveries by research teams using the developed algorithms.
Sponsor - The multiyear BioPilot project is supported by the U.S. Department of Energy's Office of Advanced Scientific Computing Research as a joint research effort between the Pacific Northwest National Laboratory and Oak Ridge National Laboratory. Some of this work was conducted in the Environmental Molecular Sciences Laboratory, a Department of Energy national scientific user facility located at the Pacific Northwest National Laboratory.
Research Team - Dr. T.P. Straatsma (PNNL) and Dr. Nagiza Samatova (ORNL) serve as principal investigators for the BioPilot project. Other members of the team are: Dr. William Cannon, Dr. Haluk Resat, Dr. Roberto D. Lins, Dr. Heidi Sofia, and Dr. Christopher Oehmen, PNNL; Dr. Andrey Gorin, Dr. Ed Uberbacher, Dr. Tatiana Karpinets, Dr. Byung-Hoon Park, and Dr. Chongle Pan, ORNL.