Abstract

Distributed learning can enable scalable and effective decision making in numerous complex cyber-physical systems such as smart transportation, robotic swarms, and power systems. However, most existing learning paradigms do not guarantee system stability, and this limitation can hinder the wide deployment of machine learning for decision making in safety-critical systems. This paper presents a stability-guaranteed distributed reinforcement learning (SGDRL) framework for interconnected linear subsystems that does not require knowledge of the subsystem models. While the learning process requires data obtained through a peer-to-peer (p2p) communication architecture, the control implementation of each subsystem is based only on its local state. The stability of the interconnected subsystems is ensured by a diagonally dominant eigenvalue condition, which is then used in a model-free RL algorithm to learn the feedback control gains. The RL algorithm follows an off-policy iterative framework with interleaved policy evaluation and policy update steps. We numerically validate our theoretical results through simulations on four interconnected subsystems.
Revised: December 31, 2020 | Published: November 1, 2021