Computing Research

Applied Statistics

PNNL is home to one of the largest and most accomplished teams of statisticians and computational modelers in the DOE laboratory complex. These scientists use the foundations of statistical and mathematical theory to develop high-quality, defensible methods and applications to address critical scientific challenges. This research often involves large, complex, or streaming data and is deployed across a variety of compute architectures, including public/private clouds and supercomputers. The team strives to disseminate statistical research to the public via open-source statistical software.

Key Capabilities

  • Experimental Design
  • Anomaly Detection
  • Linear and Nonlinear Modeling
  • Machine Learning
  • Heterogeneous Data Integration
  • Large-scale Computational Statistics
  • Computational Human Behavior Modeling
  • Feature Discovery
  • Model Verification and Validation
  • Data Science Algorithm Development and Evaluation

Significant Projects

P-Mart

P-Mart is an interactive, web-based software environment that enables biomedical and biological scientists to perform in-depth analyses of global proteomics data without in-depth knowledge of statistical programming. P-Mart offers a series of statistical modules associated with:
  • Quality assessment
  • Peptide and protein statistics
  • Protein quantification
  • Exploratory data analyses
P-Mart offers access to multiple cancer proteomic datasets generated through the Clinical Proteomic Tumor Analysis Consortium (CPTAC) at the peptide, gene, and protein levels. Users can also upload private data via three easy-to-format data files. Analyses are performed via customized workflows and interactive visualizations. The final report and results files are in .csv format, so they can be attached to a publication to make any analysis easy to reproduce.

A subset of the R functions is currently available on GitHub.
The web service can be installed from Docker Hub.

Visual Sample Plan

Visual Sample Plan, or VSP, is a software tool that supports the development of a defensible sampling plan. It grounds decisions in statistical sampling theory and the statistical analysis of sample results, so they can be made with confidence. VSP couples site, building, and sample location visualization capabilities with optimal sampling design and statistical analysis strategies. It currently supports design and analysis for numerous applications, including:
  • Environmental characterization and remediation
  • Environmental monitoring and stewardship
  • Response to and recovery from chemical, biological, or radiological terrorist events
  • Footprint reduction and remediation of unexploded ordnance (UXO) sites
  • Sampling of soils, buildings, groundwater, sediment, surface water, and subsurface layers
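
As an illustration of the kind of statistical sampling theory that underlies a defensible sampling plan, the sketch below computes how many samples are needed to estimate a mean concentration to within a chosen margin of error. It uses the textbook normal-approximation formula n = (z·σ/d)², not VSP's own algorithms; the function name, the assumed standard deviation, and the margin are all hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Samples needed so a two-sided confidence interval for the mean
    has half-width `margin`, assuming a known standard deviation `sigma`
    and approximately normal sample means (textbook formula, not VSP's)."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    # Round up: a fractional sample is not possible.
    return math.ceil((z * sigma / margin) ** 2)


# Hypothetical inputs: standard deviation of 10 units, margin of 2 units.
n_samples = sample_size_for_mean(sigma=10.0, margin=2.0)
print(n_samples)
```

Halving the margin roughly quadruples the required number of samples, which is why the acceptable decision error is usually fixed before any field work begins.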

Computational Modeling of Human Behavior

PNNL researchers have developed an approach to social-behavioral modeling based on Bayesian networks. Bayesian networks are a widely used inference modeling approach that enables hypotheses to be evaluated in light of the available evidence. When developed in conjunction with social science literature, they can be extremely useful in modeling human and organizational behavior. PNNL’s approach combines technical data and expert judgment to assess how the likelihood of a model hypothesis changes as new information is obtained. In modeling human behavior, it can be difficult to obtain datasets of sufficient size or completeness to estimate the parameters of a Bayesian network. To address this, PNNL developed a unique methodology that uses structured elicitation with subject matter experts to estimate model parameters. Once constructed, these models can be used by analysts to assess the likelihood of an outcome given evidence observed over time.
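
The idea of updating a hypothesis as evidence arrives can be sketched with a very small Bayesian network: one binary hypothesis node with two indicator nodes that are conditionally independent given the hypothesis. This is a minimal illustration, not PNNL's model or elicitation method; the indicator names and every probability are made-up placeholders standing in for values that subject matter experts would supply.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(H | evidence) for a binary hypothesis H via Bayes' rule."""
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / denominator


# Hypothetical elicited parameters for each indicator:
# (P(indicator observed | H true), P(indicator observed | H false)).
ELICITED_CPT = {
    "indicator_a": (0.70, 0.20),
    "indicator_b": (0.60, 0.30),
}


def belief_over_time(prior, observations):
    """Update belief in H after each (indicator, observed) pair.

    Assumes the indicators are conditionally independent given H,
    so evidence can be folded in one observation at a time."""
    belief = prior
    history = [belief]
    for name, observed in observations:
        p_true, p_false = ELICITED_CPT[name]
        if not observed:
            # Use the complementary probabilities when the indicator is absent.
            p_true, p_false = 1.0 - p_true, 1.0 - p_false
        belief = bayes_update(belief, p_true, p_false)
        history.append(belief)
    return history


# Start from a 10% prior, then see indicator_a present and indicator_b absent.
trajectory = belief_over_time(0.10, [("indicator_a", True), ("indicator_b", False)])
print([round(p, 3) for p in trajectory])
```

Here the first observation raises the belief in the hypothesis and the second lowers it again, showing how an analyst could track the likelihood of an outcome as evidence accumulates.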
