TY - JOUR
T1 - Robust hierarchical games of linear discrete-time systems based on off-policy model-free reinforcement learning
AU - Ma, Xiao
AU - Yuan, Yuan
N1 - Publisher Copyright:
© 2024
PY - 2024/5
Y1 - 2024/5
N2 - An off-policy model-free reinforcement learning (RL) algorithm is proposed for a robust hierarchical game while considering incomplete information and input constraints. The robust hierarchical game exhibits characteristics of a Stackelberg–Nash (SN) game, where equilibrium points are designated as Stackelberg–Nash–Saddle equilibrium (SNE) points. An off-policy method is employed for the RL algorithm, addressing input constraints by using excitation inputs instead of real-time updated policies as control inputs. Moreover, a model-free method is implemented for the off-policy RL algorithm, accounting for the challenge posed by incomplete information. The goal of this paper is to develop an off-policy model-free RL algorithm to obtain approximate SNE policies of the robust hierarchical game with incomplete information and input constraints. Furthermore, the convergence and effectiveness of the off-policy model-free RL algorithm are guaranteed by proving the equivalence of the Bellman equation between nominal SNE policies and approximate SNE policies. Finally, a simulation is provided to verify the advantages of the developed algorithm.
AB - An off-policy model-free reinforcement learning (RL) algorithm is proposed for a robust hierarchical game while considering incomplete information and input constraints. The robust hierarchical game exhibits characteristics of a Stackelberg–Nash (SN) game, where equilibrium points are designated as Stackelberg–Nash–Saddle equilibrium (SNE) points. An off-policy method is employed for the RL algorithm, addressing input constraints by using excitation inputs instead of real-time updated policies as control inputs. Moreover, a model-free method is implemented for the off-policy RL algorithm, accounting for the challenge posed by incomplete information. The goal of this paper is to develop an off-policy model-free RL algorithm to obtain approximate SNE policies of the robust hierarchical game with incomplete information and input constraints. Furthermore, the convergence and effectiveness of the off-policy model-free RL algorithm are guaranteed by proving the equivalence of the Bellman equation between nominal SNE policies and approximate SNE policies. Finally, a simulation is provided to verify the advantages of the developed algorithm.
KW - Model-free
KW - Off-policy
KW - Reinforcement learning
KW - Robust hierarchical game
UR - http://www.scopus.com/inward/record.url?scp=85189675479&partnerID=8YFLogxK
U2 - 10.1016/j.jfranklin.2024.106711
DO - 10.1016/j.jfranklin.2024.106711
M3 - Article
AN - SCOPUS:85189675479
SN - 0016-0032
VL - 361
JO - Journal of the Franklin Institute
JF - Journal of the Franklin Institute
IS - 7
M1 - 106711
ER -