December 30, 2025
Conference Paper

Demystifying the Resilience of Large Language Models: An End-to-End Perspective

Abstract

Deep neural networks are known to be resilient to random bitwise faults in their parameters. However, this resilience has primarily been established through evaluations of classification models, and the extent to which the claim holds for large language models remains underexplored. In this work, we conduct an extensive measurement study of the impact of random bitwise faults on commercial-scale language models, performing an in-depth analysis of the resulting generation outputs. We first show that these language models are not truly resilient to random bit-flips: while aggregate metrics such as accuracy may suggest resilience, a close inspection of the generated outputs reveals significant degradation in text quality. Our analysis also shows that tasks requiring more complex reasoning suffer greater performance and quality degradation. Moreover, we extend our analysis to models with augmented reasoning capabilities, such as Chain-of-Thought prompting and Mixture-of-Experts architectures, and characterize their failure modes under random bit-flips.
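The paper's fault-injection methodology is not detailed on this page. As a rough illustration of the kind of random bitwise parameter fault the abstract describes, the sketch below flips a single randomly chosen bit in a float32 PyTorch weight tensor. The function name, the one-bit-per-tensor design, and the usage pattern are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of random bitwise fault injection; NOT the paper's
# actual methodology. Assumes contiguous float32 PyTorch parameters.
import random
import struct

import torch


def flip_random_bit(param: torch.Tensor) -> None:
    """Flip one uniformly random bit in a randomly chosen float32 element."""
    flat = param.data.view(-1)
    idx = random.randrange(flat.numel())  # which element to corrupt
    bit = random.randrange(32)            # which of the 32 bits to flip

    # Reinterpret the float's bytes as an unsigned int, XOR the chosen bit,
    # then reinterpret the result back as a float (may yield NaN/inf).
    as_int = struct.unpack("<I", struct.pack("<f", flat[idx].item()))[0]
    corrupted = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))[0]
    flat[idx] = corrupted


# Usage sketch: inject one fault into each parameter tensor of a toy model.
model = torch.nn.Linear(16, 16)
for p in model.parameters():
    flip_random_bit(p)
```

Which bit is flipped matters greatly in practice: a flip in a float32 exponent bit can change a weight's magnitude by orders of magnitude, while a low-order mantissa flip is usually benign.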

Citation

Sun, Y., Z. Coalson, S. Chen, H. Liu, Z. Zhang, S. Hong, B. Fang, et al. 2025. Demystifying the Resilience of Large Language Models: An End-to-End Perspective. In The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC25), 1127-1144. PNNL-SA-211015. doi:10.1145/3712285.3759803