Improving Knowledge and Predictions of Hyporheic Zone Respiration via Continental-Scale Iterative ICON-ModEx Science
ICON-ModEx PNNL Leadership Team*: Xingyuan Chen, Brieanne Forbes, Amy Goldman, Emily Rexer, Tim Scheibe, James Stegen
ICON-ModEx Parallel Works Leadership Team*: Stefan Gary and Michael Wilde
*Names are listed in alphabetical order.
Below is an overview of the project’s research motivation and methods, details on how to get involved in sampling, and information on other ways to engage. Reach out to firstname.lastname@example.org to learn more. For updates about the collaborative manuscript effort, scroll to the bottom of the page.
Summary of opportunities (see below for details):
Ask the project team to give a virtual presentation to your classroom about the science and hands-on opportunities.
Contribute by helping collect samples using WHONDRS-supplied materials at locations near you.
Join the development of a manuscript using data from this project and/or access all data to pursue your own interests.
Expand your collaborative network via opportunities provided by WHONDRS.
River corridors are a vital part of the Earth system: they contribute to global biogeochemical cycles and influence water and habitat quality, with local to regional implications. Within river corridors, the microbes in sediments immediately below the riverbed (i.e., in the hyporheic zone) are often a primary control over river corridor biogeochemical function. Their contribution varies tremendously across river corridors, with hyporheic zones accounting for 4-96% of stream respiration. Aerobic respiration is a dominant biogeochemical process in river corridors and in hyporheic zone sediments, and its rate varies by at least two orders of magnitude across the contiguous U.S. (ConUS). However, we lack knowledge and predictive models of variation in hyporheic zone respiration rates.
The project addresses this knowledge gap through an effort that combines public data, machine learning and artificial intelligence (AI) modeling, and ICON science principles with an iterative model-experiment (ModEx) approach that leverages ConUS-scale crowdsourced sampling (Fig. 1). ICON principles are used in this project by Integrating AI modeling with biogeochemistry, Coordinating sample collection protocols to be consistent with numerous previous efforts across research groups, Openly and globally sharing the ideas and data from the associated research throughout the research lifecycle, and Networking with stakeholders throughout the research effort to understand how to increase the mutual benefits of data, knowledge, and model outcomes while avoiding situations in which data/knowledge is extracted from people/lands without providing value in return. Use of these principles in this project is facilitated via engagement with the ICON Science Cooperative (https://icon-science.pnnl.gov).
To learn more about ICON in ICON-ModEx, watch the recorded presentation from the May 2022 U.S. Department of Energy (DOE) Biological and Environmental Research (BER) Environmental Systems Science (ESS) Program Principal Investigators Meeting.
Existing Data and Model Improvement
To begin filling this knowledge gap, the project used data from a previous crowdsourced sampling campaign led by WHONDRS to develop an ensemble of predictive AI models (Fig. 2). Examining the performance of the aggregated models revealed strong biases, especially towards high respiration rates (Fig. 2a). The aggregated model was then used to predict respiration rates at unsampled locations across the ConUS, using data from public databases (e.g., GLORICH; https://doi.org/10.1016/j.proeps.2014.08.005). Doing so revealed spatial clustering of high vs. low model uncertainty or bias (Fig. 2b). The project is attempting to improve the model (e.g., by reducing bias) while generating additional knowledge of which variables explain and potentially drive variation in hyporheic zone sediment aerobic respiration rates. To do this, unsampled locations are ranked by their priority for additional sample collection and respiration rate estimation. Sites with the highest priority are those with the highest predictive uncertainty and with environmental characteristics beyond what was covered in the original WHONDRS sampling campaign.
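The site-ranking step described above can be illustrated with a minimal Python sketch. It is not the project's actual code. It assumes that predictive uncertainty is approximated by the spread of predictions across ensemble members, that environmental novelty is the distance to the nearest already-sampled site in standardized feature space, and that the two criteria are blended with an equal, hypothetical weight. The function names, dictionary keys, and example values are all invented for illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(rows):
    """Scale each feature column to zero mean and unit spread (constant columns map to 0)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def rank_sites(candidates, sampled_features, weight=0.5):
    """Rank candidate sites by a blend of ensemble spread and environmental novelty.

    candidates: list of dicts with 'id', 'features' (list of floats), and
                'ensemble_predictions' (one predicted rate per ensemble member).
    sampled_features: feature lists for sites that already have measurements.
    weight: hypothetical mixing weight between the two criteria (an assumption).
    """
    all_rows = standardize([c["features"] for c in candidates] + sampled_features)
    cand_rows = all_rows[:len(candidates)]
    samp_rows = all_rows[len(candidates):]

    # Predictive uncertainty: spread of predictions across the ensemble.
    uncertainty = [pstdev(c["ensemble_predictions"]) for c in candidates]
    # Novelty: distance to the nearest already-sampled site in feature space.
    novelty = [
        min(sqrt(sum((a - b) ** 2 for a, b in zip(row, s))) for s in samp_rows)
        for row in cand_rows
    ]

    def scaled(xs):
        # Divide by the largest value so both criteria lie on a 0-1 scale.
        top = max(xs) or 1.0
        return [x / top for x in xs]

    scores = [
        weight * u + (1 - weight) * n
        for u, n in zip(scaled(uncertainty), scaled(novelty))
    ]
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [(candidates[i]["id"], round(scores[i], 3)) for i in order]


# Toy example with made-up numbers: two candidate sites, two sampled sites.
candidates = [
    {"id": "A", "features": [1.0, 1.0], "ensemble_predictions": [1.0, 1.1, 0.9]},
    {"id": "B", "features": [10.0, 10.0], "ensemble_predictions": [1.0, 5.0, 9.0]},
]
sampled = [[1.0, 1.2], [1.5, 1.0]]
ranking = rank_sites(candidates, sampled)
print(ranking)
```

In this toy example, site B ranks first because it has both the wider ensemble spread and the larger distance from previously sampled conditions. Other choices (for example, a different distance measure or weighting) would be equally reasonable under the description above.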
Given a list of prioritized sites across the ConUS that changes monthly based on the AI models (Fig. 3), we are contacting potential collaborators near those sites and advertising openly. Collaborators began collecting samples in April 2022 (Fig. 4) and sampling is anticipated to continue until September 2023. For those willing to collaborate on sample collection, we will provide a free standardized sampling kit and protocol for sample collection. Sediments will be measured for respiration rate, molecular composition of organic matter, grain size, potential and expressed microbial metabolism, and concentrations of C, N, ions, Fe(II), and ATP. Adjacent surface water will be measured for C, N, and ion concentrations, dissolved oxygen, pH, and molecular composition of organic matter. These measurements were selected based on feedback from open discussions with the science community, with the goal of maximizing value for those within and beyond the core project team.
Scientific Questions and Outcomes
Primary outcomes from this study will be an improved data-driven model capable of predicting sediment respiration across the ConUS as well as knowledge about variables that explain and potentially cause variation in respiration rates. In addition, the data-driven model will be included within a process-based basin-scale model that integrates watershed hydrology and biogeochemistry. This model will be applied to the Yakima River Basin (YRB) in Washington State. The process-based YRB model that includes the AI-based respiration model will be used to evaluate how net basin-scale fluxes of C and N are influenced by variation in hyporheic zone respiration rates. These outcomes are guided by the following scientific questions:
- Can we use a ConUS-scale ICON-ModEx approach to continually improve the ability of AI models to predict sediment respiration rates, including decreases in total uncertainty and decreases in how biased the uncertainty is towards high respiration rates?
- As we progressively add more data, especially in locations with environmental conditions beyond what was initially sampled, does the relative importance or ranking of each explanatory variable remain relatively stable, or does it fluctuate?
- Initial modeling indicated that mass spectrometry data can provide influential explanatory variables. Is it necessary to include these data and/or other molecular data types (e.g., microbial metagenomes) that cannot be readily inferred/mapped from existing databases?
- How do basin-scale fluxes of C and N in the YRB respond to predicted variation in hyporheic zone aerobic respiration rates?
Answering these questions and developing an improved predictive model provides value in a number of ways. For example, the knowledge generated from this study can guide hypothesis-driven research focused on mechanisms governing respiration across environmentally divergent hyporheic zones. Similar to the YRB model, the AI-based model can inform other models used to predict local-scale environmental quality and/or regional-to-global scale biogeochemical fluxes under current and future environmental conditions.
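For the second question above, one simple way to quantify whether variable rankings stay stable between model iterations is a Spearman rank correlation of the importance scores. The sketch below is only an illustration under that assumption. The variable names and importance values are hypothetical, and the project may use a different measure.

```python
from math import sqrt
from statistics import mean


def ranks(values):
    """Rank values from largest (rank 1) to smallest; ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(importance_a, importance_b):
    """Spearman rank correlation between two importance dicts over shared variables."""
    names = sorted(set(importance_a) & set(importance_b))
    ra = ranks([importance_a[n] for n in names])
    rb = ranks([importance_b[n] for n in names])
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    if va == 0 or vb == 0:
        # All variables tied in one iteration: the correlation is undefined.
        return float("nan")
    return cov / sqrt(va * vb)


# Hypothetical importance scores from two successive model iterations.
iteration_1 = {"dissolved_oxygen": 0.40, "grain_size": 0.35, "organic_matter": 0.25}
iteration_2 = {"dissolved_oxygen": 0.30, "grain_size": 0.45, "organic_matter": 0.25}
stability = spearman(iteration_1, iteration_2)
print(stability)
```

A value near 1 indicates that the order of explanatory variables was preserved between iterations, while lower or negative values indicate that the ranking was reshuffled as new data were added.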
What is the value of getting engaged in the ICON-ModEx project?
- Help shape how this effort openly collaborates with you and other stakeholders. This project is based around an ongoing and iterative design process. You can help shape how we engage with stakeholders to meet your needs, other stakeholder needs, and remove barriers to collaboration and science opportunities that you and your teams/students might face.
- Join, guide, and/or help enhance educational opportunities for you, your teams/students, and other scientists. The project team can give a virtual presentation, with some hands-on elements, to your class. We talk about the science, ways to get involved, and career opportunities at national labs and in industry. Also, by contributing samples and/or guidance on data collection, collaborator engagement, and/or modeling, you will be enhancing the foundations for a future open-access short course focused on integration of artificial intelligence (AI) modeling with molecular and environmental data. That course can be adapted into other courses and training you lead or are otherwise engaged in, and could be an entry point for engaging in crowdsourced science.
- Help advance river corridor science through development of transferable knowledge and predictive models. River corridors serve major roles throughout the Earth system by modulating the quantity and quality of water used by human societies, providing habitat for myriad creatures, and contributing strongly to the future course of Earth’s climate via influences over elemental cycles. Our ability to advance towards a high-quality future for humans and the environment depends, in part, on being able to predict the future of river corridors. Doing so requires knowledge and models that transcend individual river corridors (i.e., that are transferable across river corridors). The ICON-ModEx project is a major contributor to achieving this transferability, but it can succeed only via deep engagement with and support from people like you.
- Generate data (for free) from your field site to support your publications and proposals. Data-generation costs are covered by the ICON-ModEx project. Data are generated in a standardized way, which makes data from your field site interoperable with data from other field sites. This allows you to put your field site in the context of diverse river corridor environments and pursue cross-site analyses. These open data could be used for manuscripts, proposals, or any other products you want to lead or contribute to. All data are being published on ESS-DIVE (https://data.ess-dive.lbl.gov/) with a first data package currently available: https://data.ess-dive.lbl.gov/view/doi:10.15485/1923689.
The first discussion for crowdsourced feedback occurred in December 2021. The second discussion occurred in February 2022. Crowdsourced sampling began in April 2022 and is planned through September 2023 in the contiguous United States. The first data package was published in January 2023 and will be updated as new data are generated. An update call is scheduled for June 2023.
If you are interested in sampling, please email email@example.com and we will discuss the site selection and scheduling logistics with you. The field time required is about an hour for a two-person team. If you are interested in engaging in the manuscript, classroom presentation, or other engagement opportunities, please also email firstname.lastname@example.org.
Collaborative Manuscript Effort
Information and updates regarding the collaborative manuscript effort will be added here.
April 2023 Update: WHONDRS is excited to start a new collaborative manuscript effort with anyone who is interested in engaging. The manuscript will focus on sediment respiration and the ICON-ModEx process. Are you interested in helping build the storyboard, run analyses, interpret results, draft/edit text, or contribute to other parts of the manuscript process? We would love to hear from you. Reach out to email@example.com to get started. If you’re interested in engaging, we suggest joining the ICON-ModEx update discussion on June 9, 2023, 8:00 am - 9:00 am PDT. Sign up at this Google form and we will send you the initial storyboard and meeting details.
Funding for this project is provided by the United States Department of Energy (DOE) Biological and Environmental Research (BER) Environmental Systems Science (ESS) Program and the DOE Small Business Innovation Research (SBIR) Program.