Improving Knowledge and Predictions of Hyporheic Zone Respiration via Continental-Scale Iterative ICON-ModEx Science
ICON-ModEx PNNL Leadership Team*: Mikayla Borton, Xingyuan Chen, Amy Goldman, Emily Rexer, Tim Scheibe, James Stegen
ICON-ModEx Parallel Works Leadership Team*: Stefan Gary and Michael Wilde
*Names are listed in alphabetical order. Reach out to firstname.lastname@example.org to get engaged.
River corridors are a vital part of the Earth system: they contribute to global biogeochemical cycles and influence water and habitat quality, with local to regional implications. Within river corridors, the microbes in sediments immediately below the river bed (i.e., in the hyporheic zone) are often a primary control over river corridor biogeochemical function. These contributions vary tremendously across river corridors, with hyporheic zones contributing from 4% to 96% of stream respiration. In particular, aerobic respiration is a dominant biogeochemical process in river corridors, and in hyporheic zone sediments its rate varies by at least two orders of magnitude across the contiguous U.S. (ConUS). However, we lack knowledge and predictive models of variation in hyporheic zone respiration rates.
This project addresses this knowledge gap through an effort that combines public data, artificial intelligence (AI) modeling, and ICON science principles with an iterative model-experiment (ModEx) approach that leverages ConUS-scale crowdsourced sampling (Fig. 1). ICON principles are used in this project by Integrating AI modeling with biogeochemistry, Coordinating sample collection protocols to be consistent with numerous previous efforts across research groups, Openly and globally sharing the ideas and data from the associated research throughout the research lifecycle, and Networking with stakeholders throughout the research effort to understand how to increase the mutual utility of data, knowledge, and model outcomes while avoiding situations in which data/knowledge is extracted from people/lands without providing value in return. Use of these principles in this project is facilitated via engagement with the ICON Science Cooperative (https://icon-science.pnnl.gov).
Existing Data and Model Improvement
As a first step toward filling the stated knowledge gap, the project used data from a previous crowdsourced sampling campaign led by the Worldwide Hydrobiogeochemical Observation Network for Dynamic Rivers (WHONDRS; https://whondrs.pnnl.gov) to develop an ensemble of predictive AI models (Fig. 2). Examining the performance of the aggregated models revealed strong biases, especially towards high respiration rates (Fig. 2a). The model was then used to predict respiration rates at unsampled locations across the ConUS with data from public databases (e.g., GLORICH). Doing so revealed spatial clustering of high vs. low model uncertainty or bias (Fig. 2b). The project is attempting to improve the model (e.g., by reducing bias) while generating additional knowledge of which variables explain and potentially drive variation in hyporheic zone sediment aerobic respiration rates. To do this, unsampled locations are ranked by their priority for additional sample collection and respiration rate estimation. Sites with the highest priority are those with the highest predictive uncertainty and with environmental characteristics beyond what was covered in the original WHONDRS sampling campaign.
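The ranking idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual method: it assumes predictive uncertainty is measured as the spread of ensemble member predictions, that environmental novelty is the distance to the nearest already-sampled site in a pre-standardized feature space, and that the two are combined with a hypothetical `weight`. All function names, site ids, and numbers are invented for illustration.

```python
import math
import statistics


def ensemble_uncertainty(predictions):
    """Spread (population standard deviation) of ensemble member predictions."""
    return statistics.pstdev(predictions)


def novelty(features, sampled_features):
    """Euclidean distance to the nearest already-sampled site.

    Assumes the features were standardized beforehand.
    """
    return min(math.dist(features, s) for s in sampled_features)


def min_max_scale(values):
    """Rescale values to [0, 1]; constant inputs map to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_sites(candidates, sampled_features, weight=0.5):
    """Return site ids ordered from highest to lowest sampling priority.

    candidates: dict of site id -> (ensemble predictions, feature vector)
    weight: hypothetical trade-off between uncertainty and novelty
    """
    ids = list(candidates)
    unc = min_max_scale([ensemble_uncertainty(candidates[i][0]) for i in ids])
    nov = min_max_scale([novelty(candidates[i][1], sampled_features) for i in ids])
    scores = {i: weight * u + (1 - weight) * n for i, u, n in zip(ids, unc, nov)}
    return sorted(ids, key=lambda i: scores[i], reverse=True)


# Made-up illustration values, not project data.
sampled = [(0.0, 0.0), (1.0, 1.0)]
candidates = {
    "site_A": ([10.0, 10.5, 9.5], (0.1, 0.1)),   # low spread, close to sampled sites
    "site_B": ([5.0, 50.0, 200.0], (4.0, 4.0)),  # high spread, far from sampled sites
    "site_C": ([20.0, 30.0, 40.0], (2.0, 2.0)),  # intermediate on both measures
}
priority = rank_sites(candidates, sampled)
print(priority)
```

In this toy example the site with both the widest ensemble spread and the most unfamiliar conditions ranks first. A real workflow might use a different uncertainty measure or novelty metric; the sketch only shows how the two criteria could be combined into a single ordering.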
Given a list of prioritized sites across the ConUS (Fig. 3), we are contacting potential collaborators near those sites. For those willing to collaborate on sample collection, we will provide a free standardized sampling kit and protocol. Collaboration is open globally and is encouraged for sample collection, feedback, data collaboration, and anything else. The team can be reached at WHONDRS@pnnl.gov. Sediments will be measured for respiration rate, molecular composition of organic matter, grain size, potential and expressed microbial metabolism, and concentrations of C, N, and ions. Adjacent surface water will be measured for C, N, and ion concentrations, dissolved oxygen, pH, and potential and expressed microbial metabolism. These measurements were selected based on feedback from open calls with the science community, with the goal of maximizing value for those within and beyond the core project team.
Scientific Questions and Outcomes
Primary outcomes from this study will be an improved data-driven model capable of predicting sediment respiration across the ConUS, as well as knowledge about variables that explain and potentially cause variation in respiration rates. In addition, the data-driven model will be included within a process-based basin-scale model that integrates watershed hydrology and biogeochemistry. This model will be applied to the Yakima River Basin (YRB) in Washington State. The process-based YRB model that includes the AI-based respiration model will be used to evaluate how net basin-scale fluxes of C and N are influenced by variation in hyporheic zone respiration rates. These outcomes are guided by the following scientific questions:
- Can we use a ConUS-scale ICON-ModEx approach to continually improve the ability of AI models to predict sediment respiration rates, including decreases in total uncertainty and decreases in how biased the uncertainty is towards high respiration rates?
- As we progressively add more data, especially in locations with environmental conditions beyond what was initially sampled, does the relative importance or ranking of each explanatory variable remain relatively stable, or does it fluctuate?
- Initial modeling indicated that mass spectrometry data can provide influential explanatory variables. Is it necessary to include these data and/or other data types (e.g., microbial metagenomes) that cannot be readily inferred/mapped from existing databases?
- How do basin-scale fluxes of C and N in the YRB respond to predicted variation in hyporheic zone aerobic respiration rates?
Answering these questions and developing an improved predictive model provides value in a number of ways. For example, the knowledge generated from this study can guide hypothesis-driven research focused on mechanisms governing respiration across environmentally divergent hyporheic zones. Similar to the YRB model, the AI-based model can inform other models used to predict local-scale environmental quality and/or regional-to-global scale biogeochemical fluxes under current and future environmental conditions.
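One way to examine the second question above (whether variable rankings stay stable as data are added) is to compare importance rankings between model iterations with a rank correlation. The sketch below is illustrative only: it assumes each iteration yields one importance score per explanatory variable with no ties, and the variable names and scores are invented, not project results.

```python
def ranks(importances):
    """Map each variable to its rank (1 = most important); assumes no ties."""
    ordered = sorted(importances, key=importances.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def spearman(importances_a, importances_b):
    """Spearman rank correlation for two importance dicts with the same keys.

    Uses the no-ties formula and needs at least two variables.
    """
    ra, rb = ranks(importances_a), ranks(importances_b)
    n = len(ra)
    d2 = sum((ra[k] - rb[k]) ** 2 for k in ra)
    return 1 - 6 * d2 / (n * (n**2 - 1))


# Hypothetical importance scores from two successive model iterations.
iteration_1 = {"grain_size": 0.40, "dissolved_oxygen": 0.30,
               "organic_matter": 0.20, "pH": 0.10}
iteration_2 = {"grain_size": 0.35, "dissolved_oxygen": 0.15,
               "organic_matter": 0.38, "pH": 0.12}

rho = spearman(iteration_1, iteration_2)
print(round(rho, 3))  # 0.4
```

A value near 1 would indicate that the ranking is stable between iterations, while values near 0 or below would indicate that it fluctuates. With real data, ties and differing variable sets would need explicit handling.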
What is the value of getting engaged in the ICON-ModEx project?
- Help shape how this effort openly collaborates with you and other stakeholders. This project is based around an ongoing and iterative design process. You can help shape how we engage with stakeholders to meet your needs, other stakeholder needs, and remove barriers to collaboration and science opportunities that you and your teams/students might face.
- Guide and/or help enhance educational opportunities for you, your teams/students, and other scientists. By contributing samples and/or guidance on data collection, collaborator engagement, and/or modeling, you will be enhancing the foundations for a future open-access short course focused on integration of artificial intelligence (AI) modeling with molecular and environmental data. That course can be adapted into other courses and training you lead or are otherwise engaged in, and could be an entry point for engaging in crowdsourced science.
- Help advance river corridor science through development of transferable knowledge and predictive models. River corridors serve major roles throughout the Earth system by modulating the quantity and quality of water used by human societies, providing habitat for myriad creatures, and contributing strongly to the future course of Earth’s climate via influences over elemental cycles. Our ability to advance towards a high-quality future for humans and the environment depends, in part, on being able to predict the future of river corridors. Doing so requires knowledge and models that transcend individual river corridors (i.e., that are transferable across river corridors). The ICON-ModEx project is a major contributor to achieving this transferability, but it can succeed only via deep engagement with and support from people like you.
- Generate data (for free) from your field site to support your publications and proposals. Data-generation costs are covered by the ICON-ModEx project. Data are generated in a standardized way allowing for data from your field site to be interoperable with data from other field sites. This allows you to put your field site in the context of diverse river corridor environments and pursue cross-site analyses. These open data could be used for manuscripts, proposals, or any other products you want to lead or contribute to. All data will be published on ESS-DIVE (https://data.ess-dive.lbl.gov/).
The first discussion for crowdsourced feedback occurred in December 2021. The second discussion occurred in February 2022. Crowdsourced sampling began in April 2022 and is planned through September 2022 in the continental United States. WHONDRS supplies the sampling materials for free, covers shipping in both directions, and makes the data publicly available. The field time required is about an hour for a two-person team.
If you are interested in collecting samples, you can explore an interactive map of sampling locations with model predictions at https://tinyurl.com/IM22-PredictionMap2. Sites marked with purple stars are those the AI model has flagged as higher priority. If you do not find a site on the map that you would like to sample, please let us know. There are alternative approaches for adding locations (e.g., your existing field sites). If you are interested in sampling, please email email@example.com and let us know your preferred site(s) and time window(s) via this Google form.
Funding for this project is provided by the United States Department of Energy (DOE) Biological and Environmental Research (BER) Earth Systems Science (ESS) Program and the DOE Small Business Innovation Research (SBIR) Program.