February 2, 2026
Journal Article

Employing Machine Learning for New Particle Formation Identification and Mechanistic Analysis: Insights from a Six-Year Observational Study in the Southern Great Plains

Abstract

We present a supervised machine learning (ML) framework to automatically identify new particle formation (NPF) events and analyze key atmospheric factors associated with their occurrence and growth. We applied ML to detect NPF events using start time and particle concentrations across size ranges, while identifying the associated atmospheric variables, including ambient temperature, relative humidity, solar radiation intensity (SRI), wind speed, wind direction, boundary layer height, total organics, sulfate, nitrate, total surface area concentration, sulfur dioxide, and turbulent kinetic energy (TKE). We analyzed a 6-year data set from the Atmospheric Radiation Measurement (ARM) Southern Great Plains (SGP) site in Oklahoma, USA. Using long-term ground-based measurements, we identified NPF events and applied Random Forest Classifiers, which achieved 90%–95% prediction accuracy. Feature importance analysis highlighted SRI, relative humidity, and ambient temperature as the most influential variables, with normalized importances of 28%, 17%, and 10%, respectively. Partial Dependence Plots (PDPs) indicated that higher SRI and lower relative humidity were critical in promoting NPF at SGP. Seasonally, NPF events were most frequent in winter (42.1%) and spring (35.5%), and least frequent in summer (4.0%). Particle growth rates also varied seasonally, with the lowest in winter (below 2 nm hr⁻¹) and the highest in late spring and early summer (exceeding 5 nm hr⁻¹). Temperature, TKE, and aerosol properties were the primary factors in growth rate variability. This study advances predictive modeling of NPF, offers insights for future campaign deployments, and demonstrates the effectiveness of ML in understanding the formation and growth of atmospheric aerosols.

Published: February 2, 2026

Citation

Hao W., M. Mehra, G. Budhwani, T. Chakraborty, F. Mei, and Y. Wang. 2026. Employing Machine Learning for New Particle Formation Identification and Mechanistic Analysis: Insights from a Six-Year Observational Study in the Southern Great Plains. Journal of Geophysical Research: Atmospheres 131, no. 1:e2024JD043116. PNNL-SA-204367. doi:10.1029/2024JD043116
