Advanced Causal Analysis Identifies Key Features Governing Secondary Organic Aerosols
Information-transfer-based causal analysis outperforms traditional approaches in identifying key features affecting secondary organic aerosols
The Science
Chemical interactions of natural biogenic organic gases emitted from forests with anthropogenic species in the atmosphere form thousands of tiny secondary organic aerosol (SOA) particles that scatter radiation and seed clouds, affecting the Earth’s energy balance and hydrological cycle. Isoprene epoxydiol SOA (IEPOX-SOA) is one of the most complex SOA types, formed by interactions between IEPOX gases and particle-phase sulfate, particle liquid water, and acidity. Inferring causal relations among the chemistry and meteorological features that govern IEPOX-SOA from field experiments is complicated because correlations between measured variables do not necessarily imply causality. By analyzing time series outputs from a detailed regional model, researchers showed that an information-transfer-based causal analysis framework successfully identifies the dominant features affecting IEPOX-SOA. Because these causal relations are coded as mathematical equations within the aqueous multiphase chemistry module of the regional Weather Research and Forecasting Model coupled to chemistry (WRF-Chem), the IEPOX-SOA processes are known a priori and are manifested in the time series outputs of the WRF-Chem model. The causal analysis outperforms random forest and correlation analyses applied to the same dataset in identifying the key features affecting IEPOX-SOA.
The Impact
This study provides the first evidence that an advanced causal analysis combining the Koopman operator framework with an information transfer approach can give insights into the direct and indirect causal relations between key variables affecting IEPOX-SOA. Researchers compared the information transfer derived by the causal analysis with the feature importance (the contribution of each feature to IEPOX-SOA predictions) determined from random forest and correlation analyses on the same WRF-Chem dataset with known causality, and showed that the causal analysis outperforms the other two feature attribution approaches. By assessing the utility of causal approaches on a well-characterized system with a high signal-to-noise ratio, such as WRF-Chem model outputs, researchers took the first step toward applying them to field measurements. The work has implications for the analysis of both measurements and models: it could be used to understand unknown processes that might affect variables of interest, and it could likely be applied in diverse domains (e.g., climate, air quality, and human health) to identify unknown causal relations.
Summary
Researchers applied an information transfer measure coupled with the Koopman operator framework to infer causal relations between IEPOX-SOA and different chemistry and meteorological variables derived from detailed regional model predictions over the Amazon rainforest. IEPOX-SOA represents one of the most complex SOA formation pathways. Because the regional model captures the known relations of IEPOX-SOA with different chemistry and meteorological features, the simulated time series implicitly encode these causal relations. Researchers showed that a causal model successfully infers the known major causal relations between total particle-phase 2-methyltetrols (the dominant component of IEPOX-SOA over the Amazon) and input features. The causal approach identifies the dominant features affecting IEPOX-SOA in two contrasting regimes, near the surface and in the upper troposphere, where the physical and chemical processes governing IEPOX-SOA differ, as shown in the researchers' previous study. The causal analysis identified particle sulfate and water as the key features governing IEPOX-SOA near the surface, and 2-methyltetrol gases formed by surface/plant biochemistry as the most important feature in the upper troposphere. In contrast, random forest and correlation analyses ranked organic aerosols and IEPOX gas, which are correlated with IEPOX-SOA, as the most important features. The information transfer analyses showed that organic aerosols and IEPOX gas have low direct information transfer to IEPOX-SOA but transfer information to it indirectly via other features. Researchers thus provided the first proof of concept that the causal model identifies direct and indirect causal relations better than correlation and random forest analyses performed on the same dataset.
This causal analysis framework could be used to diagnose the role of unknown processes affecting a variable of interest from analyses of time series data in the field.
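The gap between a correlated feature and a direct driver can be illustrated with a toy example. The sketch below is not the information transfer measure used in the study and does not use WRF-Chem data. It assumes a hypothetical three-variable chain a -> b -> c with made-up coefficients, fits a one-step linear map by least squares (a simple linear stand-in for a data-driven Koopman approximation), and compares the fitted direct coupling into c with the plain correlation.

```python
import math
import random


def simulate(n_steps, seed=0):
    """Toy chain a -> b -> c; 'a' never appears in the update for 'c'."""
    rng = random.Random(seed)
    a = b = c = 0.0
    rows = []
    for _ in range(n_steps):
        rows.append((a, b, c))
        # The right-hand side uses the old values of a, b, c.
        a, b, c = (
            0.9 * a + rng.gauss(0.0, 1.0),
            0.5 * b + 0.8 * a + rng.gauss(0.0, 1.0),
            0.5 * c + 0.7 * b + rng.gauss(0.0, 1.0),
        )
    return rows


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def solve(m, rhs):
    """Solve m @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(m[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_one_step(rows, target):
    """Least-squares weights w such that rows[t+1][target] ~ w . rows[t]."""
    d = len(rows[0])
    past = rows[:-1]
    future = [r[target] for r in rows[1:]]
    gram = [[sum(p[i] * p[j] for p in past) for j in range(d)] for i in range(d)]
    rhs = [sum(p[i] * f for p, f in zip(past, future)) for i in range(d)]
    return solve(gram, rhs)


rows = simulate(20000)
a, b, c = zip(*rows)
corr_a_c = pearson(a, c)                        # association with the target
w_a, w_b, w_c = fit_one_step(rows, target=2)    # direct one-step couplings

print(f"corr(a, c)        = {corr_a_c:.2f}")
print(f"direct weight a->c = {w_a:.2f}")
print(f"direct weight b->c = {w_b:.2f}")
```

In this toy system, a is strongly correlated with c even though its fitted direct weight is close to zero, because its influence reaches c only through b. This mirrors, in a much simpler linear setting, the distinction the study draws between features that are merely correlated with IEPOX-SOA and features that transfer information to it directly.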
PNNL Contact
Manish Shrivastava, Pacific Northwest National Laboratory, manishkumar.shrivastava@pnnl.gov
Funding
This research is primarily supported by the U.S. Department of Energy (DOE), Office of Science Biological and Environmental Research program, Early Career Research Program at PNNL. Computational resources for the simulations were provided by the Environmental Molecular Sciences Laboratory (a DOE Office of Science user facility sponsored by the Biological and Environmental Research program located at PNNL) and the PNNL Research Computing facilities.
Related link
Published: June 7, 2024
Sinha, S., Sharma, H. & Shrivastava, M. Application of advanced causal analyses to identify processes governing secondary organic aerosols. Sci Rep 14, 10718 (2024). https://doi.org/10.1038/s41598-024-59887-7