November 1, 2007
Journal Article

Data driven computing for Biological Systems

Abstract

Biological breakthroughs that can lead to improved diagnosis and treatment of diseases, generation of clean energy, and solutions to other critical societal problems require high-performance, data-intensive computational tools that can process, analyze, and cohesively integrate massive amounts of data and information in real time. Biological computing problems are typically data-intensive and must share very large data sets effectively across many processors. Moreover, the various components of biological systems, which are composed of complex networks and pathways, must be integrated to gain a coherent understanding of the system. The more types of data that can be integrated, the deeper the insights into the biology of the system being studied. Conventional analysis software, however, has not been able to deal efficiently with such massive data sets. The goal of the Data-Intensive Computing for Complex Biological Systems (BioPilot) project, a multiyear project funded by the U.S. Department of Energy’s (DOE) Office of Advanced Scientific Computing Research (ASCR), is to create an integrated suite of highly flexible, highly adaptable pipelines of computational tools for analyzing large-scale data sets that will be used to address specific challenges facing DOE and our society.

Revised: April 24, 2009 | Published: November 1, 2007

Citation

Samatova N.F., A. Gorin, E. Uberbacher, T.V. Karpinets, B. Park, C. Pan, T. Straatsma, et al. 2007. Data driven computing for Biological Systems. SciDAC Review 5, no. Fall 2007:10-25. PNNL-SA-55247.