Conference Paper

DS-LLM: Leveraging Dynamical Systems to Enhance Both Training and Inference of Large Language Models

Abstract

The training of large language models (LLMs) faces critical computational-cost challenges that hinder their scaling toward AGI and broader adoption. With model sizes doubling approximately every 3.4 months and training costs surging from $64 million for GPT-4 to $191 million for Gemini Ultra in 2023, the economic strain is unsustainable. While optimizations such as quantization provide incremental improvements, they fail to address the fundamental bottleneck. In this work, we propose DS-LLM, a novel framework leveraging dynamical system (DS)-based machines, which exploit Natural Annealing to converge near-instantaneously to minimum-energy states, enabling orders-of-magnitude gains in efficiency. Unlike traditional methods, DS-LLM maps LLM components to optimization problems solvable via Hamiltonian configurations and utilizes the continuous electric current flow of DS machines for hardware-native gradient descent during training. We mathematically demonstrate the equivalence between existing LLMs and DS-LLMs and offer a viable approach to build a DS-LLM from a trained conventional LLM. Evaluations on models of various sizes show orders-of-magnitude speedups and energy reductions for both training and inference while maintaining comparable accuracy. Furthermore, we provide a detailed analysis and discussion of the potential challenges and solutions of this emerging computing paradigm, aiming to provide a solid foundation for future research.
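To make the core idea concrete, the sketch below is an illustrative toy only (not the paper's hardware formulation): it treats a single linear LLM component as a quadratic energy, a stand-in for a Hamiltonian encoding, and numerically integrates the continuous-time gradient flow that a DS machine would carry out physically via current flow. The least-squares mapping, variable names, and integration settings are assumptions made for this example.

```python
# Illustrative sketch: settling of a dynamical-system (DS) state into a
# minimum-energy configuration is equivalent to gradient descent on the
# encoded energy. A quadratic energy stands in for the Hamiltonian of one
# linear component; the mapping here is assumed for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Toy "LLM component": find weights w minimizing the energy
#   E(w) = 0.5 * ||X w - y||^2
X = rng.normal(size=(64, 8))
w_true = rng.normal(size=8)
y = X @ w_true

def energy(w):
    return 0.5 * np.sum((X @ w - y) ** 2)

def grad(w):
    return X.T @ (X @ w - y)

# Euler-discretized continuous-time gradient flow  dw/dt = -dE/dw.
# On DS hardware this evolution happens physically; here we integrate it.
w = np.zeros(8)
dt = 1e-3
for _ in range(20000):
    w -= dt * grad(w)

print("final energy:", energy(w))                 # ~0: minimum-energy state reached
print("max |w - w_true|:", np.max(np.abs(w - w_true)))
```

Run as a plain script with NumPy installed; the final energy approaching zero illustrates why letting the analog state relax (natural annealing) recovers the same minimizer that conventional gradient-descent training would compute iteratively.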

Published: August 8, 2025

Citation

Song, R., C. Liu, C. Wu, A. Li, D. Liu, Y. Wu, and T. Geng. 2025. DS-LLM: Leveraging Dynamical Systems to Enhance Both Training and Inference of Large Language Models. In 13th International Conference on Learning Representations (ICLR 2025), April 24-28, 2025, Singapore, 65379-65394. Appleton, Wisconsin: International Conference on Learning Representations (ICLR). PNNL-SA-208428.
