October 6, 2022
Journal Article

Discovering Hidden Geothermal Signatures using Non-Negative Matrix Factorization with Customized k-means Clustering


Discovery of hidden geothermal resources is challenging. It requires the mining of large datasets with diverse data attributes representing subsurface hydrogeological and geothermal conditions. The commonly used play fairway analysis (PFA) approach typically relies on subject-matter expertise to analyze regional data to estimate geothermal settings and favorability. Here, we demonstrate an alternative approach based on machine learning (ML) to process a geothermal dataset of southwest New Mexico (SWNM). The study region includes low- and medium-temperature hydrothermal systems. Several of these systems are not well characterized because of insufficient existing data and limited past exploratory work. This study discovers hidden patterns and relationships in the SWNM geothermal dataset to better understand regional hydrothermal conditions and energy-production favorability. This understanding is obtained by applying an unsupervised machine learning algorithm based on non-negative matrix factorization coupled with customized k-means clustering (NMFk). NMFk can automatically identify (1) hidden signatures characterizing the analyzed dataset, (2) the optimal number of these signatures, (3) the dominant data attributes associated with each signature, and (4) the spatial distribution of the extracted signatures. Here, NMFk is applied to analyze 18 geological, geophysical, hydrogeological, and geothermal attributes at 44 locations in SWNM. Using NMFk, we find data patterns and identify the spatial associations of hydrothermal signatures with four physiographic provinces in SWNM (Colorado Plateau, Mogollon-Datil volcanic field, Basin and Range, and the Rio Grande rift). The ML algorithm extracted five hydrothermal signatures in the SWNM dataset that differentiate between low- and medium-temperature hydrothermal systems in different provinces.
The algorithm also identified the Rio Grande rift and the northern Mogollon-Datil volcanic field as the most favorable physiographic provinces for future geothermal resource discovery. In addition, NMFk identified the attributes that are critical for recognizing medium-temperature hydrothermal systems in the study area. The resulting NMFk model can be applied to predict geothermal conditions and their uncertainties at new SWNM locations based on limited data from unexplored regions. The code used to perform the analyses and the corresponding data are available at https://github.com/SmartTensors/GeoThermalCloud.jl.
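
To make the factorization idea concrete, the following is a minimal sketch in plain Python and is not the authors' implementation (their code is in the repository linked above). It assumes a data matrix with locations as rows and attributes as columns, uses standard multiplicative-update NMF, and omits the customized k-means clustering step that NMFk uses. The data values, the attribute names, and the helper functions `nmf` and `frobenius_error` are hypothetical and chosen only for illustration.

```python
import random


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]


def nmf(x, k, iters=500, seed=0, eps=1e-9):
    """Approximate non-negative x (m x n) as w (m x k) times h (k x n), w, h >= 0.

    Uses multiplicative updates for the Frobenius-norm objective.
    """
    rng = random.Random(seed)
    m, n = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(k)]
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(m)]
    return w, h


def frobenius_error(x, w, h):
    """Frobenius norm of the reconstruction residual x - w h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2
               for ra, rb in zip(x, approx) for a, b in zip(ra, rb)) ** 0.5


# Made-up example: 6 locations mixing 2 underlying signatures over 4 attributes.
true_w = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.0],
          [0.0, 1.0], [0.1, 0.9], [0.0, 0.8]]
true_h = [[5.0, 4.0, 0.0, 0.5], [0.0, 0.5, 6.0, 3.0]]
x = matmul(true_w, true_h)

# Compare reconstruction error for several candidate numbers of signatures.
errors = {}
for k in (1, 2, 3):
    w_k, h_k = nmf(x, k)
    errors[k] = frobenius_error(x, w_k, h_k)
    print("k =", k, "reconstruction error =", round(errors[k], 4))

# For k = 2, report the attribute with the largest weight in each signature.
w, h = nmf(x, 2)
attributes = ["heat_flow", "boron", "fault_density", "basement_depth"]  # hypothetical
dominant = [attributes[max(range(len(row)), key=row.__getitem__)] for row in h]
print("dominant attribute per signature:", dominant)
```

In this sketch the rows of `h` play the role of signatures over attributes, and the columns of `w` give the weight of each signature at each location, which is what would be mapped to examine spatial distribution. Reconstruction error alone is only a rough guide to the number of signatures; the clustering component of NMFk is not reproduced here.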

Published: October 6, 2022


Vesselinov V.V., B. Ahmmed, M. Mudunuru, J.D. Pepin, E.R. Burns, D.L. Siler, S. Karra, et al. 2022. Discovering Hidden Geothermal Signatures using Non-Negative Matrix Factorization with Customized k-means Clustering. Geothermics 106. PNNL-SA-167650. doi:10.1016/j.geothermics.2022.102576
