Jorge Cortés
Professor
Cymer Corporation Endowed Chair
Certifying stability of reinforcement learning policies using generalized Lyapunov functions
K. Long, J. Cortés, N. Atanasov
Advances in Neural Information Processing Systems, eds.
N. Chen, M. Ghassemi, P. Koniusz, R. Pascanu, H.-T. Lin, vol. 38, Curran Associates, 2025
Abstract
Establishing stability certificates for closed-loop systems under
reinforcement learning (RL) policies is essential to move beyond
empirical performance and offer guarantees of system
behavior. Classical Lyapunov methods require a strict stepwise
decrease in the Lyapunov function, but such certificates are difficult
to construct for learned policies. The RL value function is a natural
candidate, but how to adapt it for this purpose is not well
understood. To gain intuition, we first study the linear quadratic
regulator (LQR) problem and make two key observations. First, a
Lyapunov function can be obtained from the value function of an LQR
policy by augmenting it with a residual term related to the system
dynamics and stage cost. Second, the classical Lyapunov decrease
requirement can be relaxed to a generalized Lyapunov condition
requiring only decrease on average over multiple time steps. Using
this intuition, we consider the nonlinear setting and formulate an
approach to learn generalized Lyapunov functions by augmenting RL
value functions with neural network residual terms. Our approach
successfully certifies the stability of RL policies trained on
Gymnasium and DeepMind Control benchmarks. We also extend our method
to jointly train neural controllers and stability certificates using a
multi-step Lyapunov loss, resulting in larger certified inner
approximations of the region of attraction compared to the classical
Lyapunov approach. Overall, our formulation enables stability
certification for a broad class of systems with learned policies by
making certificates easier to construct, thereby bridging classical
control theory and modern learning-based methods.
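To make the relaxed condition concrete, the following minimal sketch checks a multi-step averaged decrease for a linear closed-loop system with a quadratic candidate function. It is an illustration under stated assumptions, not the paper's formulation: the uniform average over M steps is one plausible reading of "decrease on average over multiple time steps" (the paper's condition may weight the steps differently), and the matrix A_cl, the candidate P, and the function name averaged_decrease_holds are hypothetical choices made for this example.

```python
import numpy as np


def averaged_decrease_holds(A, P, M):
    """Check whether (1/M) * sum_{i=1..M} V(x_{k+i}) < V(x_k) for every x_k != 0,
    with V(x) = x^T P x along x_{k+1} = A x_k.

    For linear dynamics and a quadratic V, this is equivalent to the symmetric
    matrix (1/M) * sum_i (A^i)^T P A^i - P being negative definite.
    Assumption: uniform weights over the M steps (illustrative only).
    """
    n = A.shape[0]
    total = np.zeros((n, n))
    A_pow = np.eye(n)
    for _ in range(M):
        A_pow = A @ A_pow                # A_pow now holds A^i
        total += A_pow.T @ P @ A_pow     # contribution of V(x_{k+i})
    gap = total / M - P
    # Negative definite iff the largest eigenvalue is strictly negative.
    return bool(np.max(np.linalg.eigvalsh(gap)) < 0.0)


# Hypothetical closed-loop matrix: stable (both eigenvalues equal 0.5) but with
# transient growth, so V(x) = ||x||^2 can increase over a single step.
A_cl = np.array([[0.5, 1.0],
                 [0.0, 0.5]])
P = np.eye(2)

one_step_ok = averaged_decrease_holds(A_cl, P, M=1)    # classical stepwise decrease
four_step_ok = averaged_decrease_holds(A_cl, P, M=4)   # averaged over 4 steps
print(one_step_ok, four_step_ok)
```

In this example the candidate V(x) = ||x||^2 fails the classical one-step test yet passes the averaged test with M = 4, which illustrates why the relaxed condition can make certificates easier to construct.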
Mechanical and Aerospace Engineering,
University of California, San Diego
9500 Gilman Dr,
La Jolla, California, 92093-0411
Ph: 1-858-822-7930
Fax: 1-858-822-3107
cortes at ucsd.edu
Skype id: jorgilliyo