December 30, 2019
Conference Paper

Ground-Truth Prediction to Accelerate Soft-Error Impact Analysis for Iterative Methods

Abstract

Understanding the impact of soft errors on applications can be expensive. Often, it requires an extensive error injection campaign involving numerous runs of the full application in the presence of errors. In this paper, we present a novel approach to arriving at the ground truth--the true impact of an error on the final output--for iterative methods by observing a small number of iterations to learn deviations between normal and error-impacted execution. We develop a machine-learning-based predictor for three iterative methods to generate ground-truth results without running them to completion for every error injected. We demonstrate that this approach achieves greater accuracy than alternative prediction strategies, including three existing soft-error detection strategies. We further show that the ground-truth prediction model is effective in evaluating both vulnerability and the effectiveness of soft-error detection strategies in the context of iterative methods.

Revised: February 12, 2021 | Published: December 30, 2019

Citation

Mutlu B., G. Kestor, A. Cristal, O. Unsal, and S. Krishnamoorthy. 2019. Ground-Truth Prediction to Accelerate Soft-Error Impact Analysis for Iterative Methods. In IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC 2019), December 17-20, 2019, Hyderabad, India, 333-344. Los Alamitos, California: IEEE Computer Society. PNNL-SA-148074. doi:10.1109/HiPC.2019.00048