Evaluation of Integrated Assessment Model Hindcast Experiments: a Case Study of the GCAM 3.0 Land Use Module

January 16, 2018

Feature

Applying metrics in a case study, scientists found that no single evaluation measure likely exists for all variables in an integrated assessment model

Media Contact: PNNL News & Media Relations

Figure: The top figure, which applies a popular global metric, shows that global error is nearly zero.

The Science

Integrated assessment modelers are increasingly conducting hindcast experiments, in which a model produces a forecast for a time period with available observational data, across models of many scales. However, no community standard exists for evaluating integrated assessment models (IAMs), which makes it difficult to compare the results of hindcast experiments from different models. Researchers at the U.S. Department of Energy's Pacific Northwest National Laboratory presented a set of evaluation metrics for model output, each of which provides information about a different aspect of model performance. To aid interpretation, they identified performance benchmarks for several of these metrics. They then applied the metrics in a case study.

The Impact

Due to the structure of most IAMs, global aggregate metrics or otherwise highly aggregated skill scores commonly used to evaluate IAM hindcast experiments are likely to mask important deficiencies. Researchers identified an easy-to-manage suite of metrics to evaluate different features of model performance. This suite of metrics may be particularly useful in parameter estimation studies to help identify parameter values that best align with historical data.

Summary

Many types of performance statistics exist for IAMs and other models, but a large number of them operate on a pass-fail basis and offer little insight into why models fail. To make evaluating the large number of variable-region combinations in IAMs more feasible, researchers selected a set of measures that can be applied at different spatial scales (regional versus global). They also identified performance benchmarks for these measures, based on the statistics of the observational data set, that allow a model to be evaluated in absolute terms rather than only relative to other models. An ideal evaluation method for hindcast experiments in IAMs would feature absolute measures for evaluating a single experiment for a single model. It would also include relative measures for comparing the results of multiple experiments for a single model, or of the same experiment repeated across multiple models. The performance benchmarks provide information about why a model might perform poorly on a given measure and therefore identify opportunities for improvement.
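As an illustration of a benchmark derived from the statistics of the observations, the sketch below normalizes root-mean-square error by the standard deviation of the observed series. Under that normalization, a value of 1 corresponds to a model that does no better than always predicting the observed mean. The choice of metric, the function names, and all data values are assumptions made for illustration, not the measures or results reported in the study.

```python
import math

def rmse(model, obs):
    """Root-mean-square error between a modeled and an observed series."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))

def normalized_rmse(model, obs):
    """RMSE divided by the (population) standard deviation of the observations.

    A value of 1.0 is the score of a constant prediction equal to the
    observed mean, so it serves as an absolute benchmark.
    """
    mean_obs = sum(obs) / len(obs)
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / len(obs))
    return rmse(model, obs) / sd_obs

# Hypothetical observed series and two hypothetical hindcasts.
obs = [10.0, 12.0, 11.0, 15.0, 17.0]
hindcast = [10.5, 11.5, 12.0, 14.0, 16.5]    # tracks the observations
mean_only = [sum(obs) / len(obs)] * len(obs)  # always predicts the observed mean

nrmse_hindcast = normalized_rmse(hindcast, obs)
nrmse_mean_only = normalized_rmse(mean_only, obs)

print(f"hindcast NRMSE:  {nrmse_hindcast:.3f}")
print(f"mean-only NRMSE: {nrmse_mean_only:.3f}")
```

A score below 1 for a single experiment can then be read without reference to any other model, while the same score can still be compared across experiments or models.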

To demonstrate the use and types of results possible with the evaluation method, researchers applied the measures to results from a past hindcast experiment focused on land allocation in the Global Change Assessment Model (GCAM) version 3.0. Researchers found quantitative evidence that global aggregate metrics alone are insufficient for evaluating IAMs like GCAM that require global supply to equal global demand at each time period. These results indicate that no single evaluation measure likely exists for all variables in an IAM, and therefore sector-by-sector evaluation might be necessary.
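The masking effect of global aggregation can be seen in a minimal sketch. When regional totals are constrained to sum to the same global value, over-allocation in one region is offset by under-allocation elsewhere, so the global error can be zero even when every region is wrong. The region names and land areas below are hypothetical values chosen for illustration; they are not GCAM output or observational data.

```python
# Hypothetical cropland areas by region (arbitrary units).
observed = {"Region A": 100.0, "Region B": 80.0, "Region C": 60.0}
modeled = {"Region A": 130.0, "Region B": 60.0, "Region C": 50.0}

# Global aggregate error: the regional errors cancel when totals match.
global_error = sum(modeled.values()) - sum(observed.values())

# Regional errors, evaluated one region at a time.
regional_errors = {region: modeled[region] - observed[region] for region in observed}
mean_abs_regional_error = sum(abs(e) for e in regional_errors.values()) / len(regional_errors)

print(f"global error: {global_error}")
for region, error in regional_errors.items():
    print(f"{region}: {error:+.1f}")
print(f"mean absolute regional error: {mean_abs_regional_error}")
```

Here the global error is zero while the mean absolute regional error is 20 units, which is why a regional measure is needed alongside any global one.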

Acknowledgments

Sponsors: This research was based on work supported by the U.S. Department of Energy Office of Science, Biological and Environmental Research as part of the Integrated Assessment Research program.

Reference: A.C. Snyder, R.P. Link, K.V. Calvin, "Evaluation of Integrated Assessment Model Hindcast Experiments: a Case Study of the GCAM 3.0 Land Use Module." Geoscientific Model Development 10, 4307-4319 (2017). [DOI: 10.5194/gmd-10-4307-2017]

Key Capabilities

Applied Mathematics

Earth Systems Science & Engineering

Advanced Computer Science, Visualization, & Data

About PNNL

Pacific Northwest National Laboratory draws on its distinguishing strengths in chemistry, Earth sciences, biology and data science to advance scientific knowledge and address challenges in energy resiliency and national security. Founded in 1965, PNNL is operated by Battelle and supported by the Office of Science of the U.S. Department of Energy. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit the DOE Office of Science website. For more information on PNNL, visit PNNL's News Center. Follow us on Twitter, Facebook, LinkedIn and Instagram.

Research Team

Abigail Snyder, Robert Link, and Kate Calvin, PNNL (Joint Global Change Research Institute)

Research topics

Earth & Coastal Sciences

Scientific Discovery
