Robust hierarchical games of linear discrete-time systems based on off-policy model-free reinforcement learning

Xiao Ma, Yuan Yuan

Research output: Contribution to journal › Article › peer-review


Abstract

An off-policy model-free reinforcement learning (RL) algorithm is proposed for a robust hierarchical game with incomplete information and input constraints. The robust hierarchical game exhibits the characteristics of a Stackelberg–Nash (SN) game, whose equilibrium points are designated as Stackelberg–Nash–Saddle equilibrium (SNE) points. An off-policy method is employed in the RL algorithm, addressing the input constraints by using excitation inputs, rather than the policies updated in real time, as control inputs. Moreover, a model-free method is incorporated into the off-policy RL algorithm to account for the challenge posed by incomplete information. The goal of this paper is to develop an off-policy model-free RL algorithm that obtains approximate SNE policies for the robust hierarchical game under incomplete information and input constraints. Furthermore, the convergence and effectiveness of the off-policy model-free RL algorithm are guaranteed by proving the equivalence of the Bellman equations for the nominal SNE policies and the approximate SNE policies. Finally, a simulation is provided to verify the advantages of the developed algorithm.
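To make the off-policy, model-free idea concrete, the sketch below runs batch least-squares Q-learning on a single-agent discrete-time linear-quadratic problem: transitions are collected once under a purely exploratory behavior input, and a target feedback gain is then evaluated and improved from that batch without using the system matrices. This is only an illustrative simplification of the general technique, not the paper's SNE algorithm for the hierarchical game; the matrices A, B, Qc, Rc, the data length, and the exploration noise are hypothetical placeholders.

```python
# Illustrative sketch: off-policy, model-free least-squares Q-learning for a
# discrete-time LQR. NOT the paper's SNE algorithm; all numbers are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Assumed linear discrete-time system and quadratic cost (unknown to the learner).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.1]])
Qc, Rc = np.eye(2), np.eye(1)
n, m = B.shape

def quad_features(z):
    """Quadratic features so that quad_features(z) @ theta = z' H z (H symmetric)."""
    outer = np.outer(z, z)
    rows, cols = np.triu_indices(len(z))
    scale = np.where(rows == cols, 1.0, 2.0)  # off-diagonal terms of H appear twice
    return outer[rows, cols] * scale

def unpack_H(theta, dim):
    """Rebuild the symmetric Q-function kernel H from its upper-triangular entries."""
    U = np.zeros((dim, dim))
    U[np.triu_indices(dim)] = theta
    return U + U.T - np.diag(np.diag(U))

# Off-policy data collection: the behavior input is pure excitation noise,
# so the target policy is never applied to the system during learning.
T = 400
X = np.zeros((T + 1, n))
U = np.zeros((T, m))
X[0] = rng.standard_normal(n)
for k in range(T):
    U[k] = 0.5 * rng.standard_normal(m)       # exploratory behavior input
    X[k + 1] = A @ X[k] + B @ U[k]            # simulator stands in for measured data

# Policy iteration on the fixed batch: evaluate the target gain K via the
# Q-function Bellman equation, then improve it from the learned kernel H.
K = np.zeros((m, n))                           # initial stabilizing gain (A is Schur stable)
for _ in range(20):
    Phi, y = [], []
    for k in range(T):
        z = np.concatenate([X[k], U[k]])
        z_next = np.concatenate([X[k + 1], -K @ X[k + 1]])   # target policy at next state
        Phi.append(quad_features(z) - quad_features(z_next))
        y.append(X[k] @ Qc @ X[k] + U[k] @ Rc @ U[k])
    theta, *_ = np.linalg.lstsq(np.asarray(Phi), np.asarray(y), rcond=None)
    H = unpack_H(theta, n + m)
    Hux, Huu = H[n:, :n], H[n:, n:]
    K = np.linalg.solve(Huu, Hux)              # policy improvement: u = -K x

print("learned feedback gain K:\n", K)
```

The off-policy element is exactly the separation highlighted in the abstract: the data are generated by an excitation input rather than by the policy being updated, while the Bellman least-squares step evaluates the current target policy on the stored next states.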

Original language: English
Article number: 106711
Journal: Journal of the Franklin Institute
Volume: 361
Issue number: 7
DOIs
State: Published - May 2024

Keywords

  • Model-free
  • Off-policy
  • Reinforcement learning
  • Robust hierarchical game
