July 24, 2025
Journal Article

Explaining drivers of housing prices with nonlinear hedonic regressions

Abstract

Linear regression models are commonly used for hedonic regression, even though machine learning models often achieve higher prediction accuracy and are better equipped to handle nonlinear relationships. The perception that machine learning models are black boxes is a major reason for this preference. However, the emergence of interpretable machine learning techniques potentially offers a more transparent way to incorporate machine learning-based regression models into hedonic analysis. To explore this potential, we constructed linear and artificial neural network (ANN)-based regression models to predict housing sales prices in the Baltimore metropolitan area, using structural property attributes (e.g., home size) and spatial attributes (e.g., distance to the city center) as predictors. Our accuracy assessment showed that the ANN model achieved higher prediction accuracy than the linear model. We then implemented ANOVA for the linear model and Sobol sensitivity analysis for the ANN model. Our results demonstrated that the ANN model explained a larger portion of the variation in housing sales prices than the linear model. We used the Sobol sensitivity analysis results to inform a Partial Dependence Plot (PDP) analysis, allowing us to quantify the complex relationship between each predictor and housing sales price while considering second-order interaction effects between predictors. Compared to the linear model, the PDP analysis revealed more complex and realistic relationships. Our study shows that Sobol sensitivity analysis combined with PDP analysis provides valuable insights into the behavior of the housing market across various predictors. These insights can be used by various stakeholders, including home buyers, real estate developers, and policy makers. Future research should consider higher-order interactions in Sobol sensitivity analysis and conduct more comprehensive uncertainty quantification to gain a better understanding of the models.
Our study demonstrates the potential of machine learning-based regression models in hedonic analysis and the value of utilizing advanced techniques for sensitivity analysis and visualization to gain insights into the relationships between input features and output prediction.
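
The abstract pairs Sobol sensitivity analysis with PDP analysis. The sketch below illustrates both ideas on a toy problem and is not the authors' implementation. Everything in it is an assumption made for illustration: the function `toy_price` is a hypothetical stand-in for a fitted ANN, the two inputs (home size and distance to the city center) are treated as independent and uniformly scaled to [0, 1], and only first-order Sobol indices are estimated (with a Saltelli-style Monte Carlo estimator), whereas the study also considers second-order interactions.

```python
import numpy as np

def toy_price(x):
    """Hypothetical nonlinear price surface (not the paper's model).
    Columns of x: home size, distance to city center, both scaled to [0, 1]."""
    size, dist = x[:, 0], x[:, 1]
    return 2.0 * size + 3.0 * np.exp(-4.0 * dist) + size * dist

def first_order_sobol(model, n_inputs, n_samples=100_000, seed=0):
    """Monte Carlo estimate of first-order Sobol indices for independent
    inputs that are uniform on [0, 1]."""
    rng = np.random.default_rng(seed)
    a = rng.random((n_samples, n_inputs))  # first independent sample matrix
    b = rng.random((n_samples, n_inputs))  # second independent sample matrix
    f_a, f_b = model(a), model(b)
    total_variance = np.var(np.concatenate([f_a, f_b]))
    indices = np.empty(n_inputs)
    for i in range(n_inputs):
        ab_i = a.copy()
        ab_i[:, i] = b[:, i]  # matrix A with column i taken from B
        # Estimates Var(E[y | x_i]), then normalizes by the total variance.
        indices[i] = np.mean(f_b * (model(ab_i) - f_a)) / total_variance
    return indices

def partial_dependence(model, background, feature, grid):
    """Average prediction over the background sample with one feature
    fixed at each grid value in turn."""
    pd_values = np.empty(len(grid))
    for k, value in enumerate(grid):
        x = background.copy()
        x[:, feature] = value
        pd_values[k] = model(x).mean()
    return pd_values

if __name__ == "__main__":
    names = ["size", "distance"]
    sobol = first_order_sobol(toy_price, n_inputs=2)
    for name, s in zip(names, sobol):
        print(f"first-order Sobol index for {name}: {s:.3f}")

    background = np.random.default_rng(1).random((500, 2))
    grid = np.linspace(0.0, 1.0, 5)
    pd_curve = partial_dependence(toy_price, background, feature=1, grid=grid)
    for value, pd_value in zip(grid, pd_curve):
        print(f"distance={value:.2f} -> mean predicted price {pd_value:.3f}")
```

In this toy setup the first-order indices sum to slightly less than one because the `size * dist` term contributes a small interaction effect, which is the kind of contribution that second-order Sobol indices are meant to capture.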

Published: July 24, 2025

Citation

Wan H., P. Roy Chowdhury, J.J. Yoon, P. Bhaduri, V. Srikrishnan, D.R. Judi, and W.B. Daniel. 2025. Explaining drivers of housing prices with nonlinear hedonic regressions. Machine Learning with Applications 21:Art. No. 100707. PNNL-SA-185736. doi:10.1016/j.mlwa.2025.100707