December 3, 2025
Journal Article
Self-diagnosis of model suitability for continuous measurements of stream-dissolved organic carbon derived from in situ UV–visible spectroscopy
Abstract
Application of high-frequency monitoring of dissolved organic carbon (DOC) is difficult in instances where training datasets are challenging to develop (e.g., remote locations) and the relationship between optical features and DOC concentration changes due to environmental or landscape shifts (e.g., climate or land-use change). We developed and compared three partial least squares (PLS) models using in situ water level measurements, conductivity, and UV–Vis spectral attenuation to predict DOC. Two site-specific models were developed using data from a hillslope-dominated forest or a low-relief wetland-pond-dominated stream catchment. The third model, using data from both sites, exhibited the best performance (DOC range = 4–15.5 mg C L⁻¹, mean = 8.38 mg C L⁻¹, training RMSE = 0.34 mg C L⁻¹, internal validation RMSE = 0.50 mg C L⁻¹, external validation RMSE = 2.43 mg C L⁻¹). We further demonstrate using PLS model statistics to monitor performance and elucidate when and how models should be updated. These statistics, Hotelling’s T² and squared prediction errors, are useful consistency checks for the predictions made and detect underlying inconsistencies that, if undetected, can reduce the robustness of DOC prediction. For example, via the T² statistic, we identified the summer–autumn transition as a period when DOC composition differed from what was represented in the training dataset. We also determined that elevated SUVA₂₅₄ values contributed to the overall bias observed in predictions made during the subsequent year as part of the external validation. This enabled the application of a bias correction that reduced the RMSE from 2.43 to 0.89 mg C L⁻¹. The method presented here could be applied to future monitoring programs enabling model updates to monitor DOC fluxes accurately from optical datasets (e.g., attenuance or fluorescence) in the face of developing datasets in remote locations or environmental change.
Implementation of this approach may also identify possible regime shifts or landscape and hydrologic change associated with climate and other environmental changes relevant to terrestrial to aquatic fluxes.
Published: December 3, 2025
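The two diagnostics named in the abstract, Hotelling's T² and the squared prediction error (SPE), can be illustrated with a minimal sketch. It is not the authors' implementation: the NIPALS PLS1 fit, the synthetic six-channel "spectra", the two spectral shapes, the noise levels, and the empirical 95th-percentile control limits are all assumptions chosen for illustration, and every function and variable name is hypothetical. T² measures how far a sample's scores sit from the training scores within the model, while SPE measures how much of the sample the model cannot reconstruct at all.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def column(rows, k):
    return [r[k] for r in rows]


def fit_pls1(X, y, n_comp):
    """NIPALS PLS1 on mean-centered X (rows = samples) and mean-centered y."""
    X = [r[:] for r in X]
    y = y[:]
    n_feat = len(X[0])
    weights, loadings, score_var = [], [], []
    for _ in range(n_comp):
        w = [dot(column(X, k), y) for k in range(n_feat)]   # w proportional to X^T y
        norm = dot(w, w) ** 0.5
        w = [v / norm for v in w]
        t = [dot(r, w) for r in X]                          # scores
        tt = dot(t, t)
        p = [dot(column(X, k), t) / tt for k in range(n_feat)]  # X loadings
        q = dot(y, t) / tt                                  # y loading
        X = [[xv - ti * pk for xv, pk in zip(r, p)] for r, ti in zip(X, t)]
        y = [yi - q * ti for yi, ti in zip(y, t)]
        weights.append(w)
        loadings.append(p)
        score_var.append(tt / (len(t) - 1))                 # scores have zero mean
    return weights, loadings, score_var


def t2_and_spe(x, weights, loadings, score_var):
    """Hotelling's T2 and SPE for one sample centered with the training means."""
    x = x[:]
    t2 = 0.0
    for w, p, s2 in zip(weights, loadings, score_var):
        t = dot(x, w)
        t2 += t * t / s2                                    # distance inside the model
        x = [xv - t * pk for xv, pk in zip(x, p)]           # deflate, as in training
    return t2, dot(x, x)                                    # residual norm^2 = SPE


# Synthetic training set (assumption): two latent constituents, six channels.
random.seed(0)
SHAPE_A = [1.0, 0.8, 0.6, 0.4, 0.2, 0.1]
SHAPE_B = [0.1, 0.2, 0.4, 0.6, 0.8, 1.0]


def spectrum(a, b, noise=0.0):
    return [a * sa + b * sb + random.gauss(0.0, noise)
            for sa, sb in zip(SHAPE_A, SHAPE_B)]


latent = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(80)]
X_raw = [spectrum(a, b, 0.02) for a, b in latent]
y_raw = [2.0 * a + b + random.gauss(0, 0.05) for a, b in latent]

x_mean = [sum(column(X_raw, k)) / len(X_raw) for k in range(6)]
y_mean = sum(y_raw) / len(y_raw)
X = [[v - m for v, m in zip(r, x_mean)] for r in X_raw]
y = [v - y_mean for v in y_raw]

model = fit_pls1(X, y, n_comp=2)
train_stats = [t2_and_spe(r, *model) for r in X]


def limit(values, q=0.95):
    # Empirical control limit (assumption); roughly the 95th percentile.
    return sorted(values)[int(q * len(values))]


t2_limit = limit([s[0] for s in train_stats])
spe_limit = limit([s[1] for s in train_stats])


def check(raw):
    t2, spe = t2_and_spe([v - m for v, m in zip(raw, x_mean)], *model)
    return {"T2": t2, "SPE": spe,
            "T2_flag": t2 > t2_limit, "SPE_flag": spe > spe_limit}


typical = check(spectrum(0.5, -0.3))      # inside the training range
extreme = check(spectrum(10.0, 10.0))     # known shapes, far outside the range
novel = check([v + d for v, d in zip(spectrum(0.5, -0.3),
                                     [1, -1, 1, -1, 1, -1])])  # unseen shape
```

In this toy setup, `extreme` exceeds the T² limit because its scores lie far outside the training scores, and `novel` exceeds the SPE limit because it carries a spectral shape that the two components cannot reconstruct. Either flag marks a prediction as outside what the training data represent, which is the kind of signal the abstract describes using to decide when a model needs updating.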