Game-Based Backstepping Design for Strict-Feedback Nonlinear Multi-Agent Systems Based on Reinforcement Learning

Jia Long; Dengxiu Yu; Guoxing Wen; Li Li; Zhen Wang; C. L. Philip Chen

doi:10.1109/TNNLS.2022.3177461

Institute of Optoelectronics and Intelligence

Research output: Contribution to journal › Article › Peer-reviewed

30 citations (Scopus)

Abstract

In this article, the game-based backstepping control method is proposed for the high-order nonlinear multi-agent system with unknown dynamic and input saturation. Reinforcement learning (RL) is employed to get the saddle point solution of the tracking game between each agent and the reference signal for achieving robust control. Specifically, the approximate optimal solution of the established Hamilton-Jacobi-Isaacs (HJI) equation is obtained by policy iteration for each subsystem, and the single network adaptive critic (SNAC) architecture is used to reduce the computational burden. In addition, based on the separation operation of the error term from the derivative of the value function, we achieve the different proportions of the two agents in the game to realize the regulation of the final equilibrium point. Different from the general use of the neural network for system identification, the unknown nonlinear dynamic term is approximated based on the state difference obtained by the command filter. Furthermore, a sufficient condition is established to guarantee that the whole system and each subsystem included are uniformly ultimately bounded. Finally, simulation results are given to show the effectiveness of the proposed method.

Original language	English
Article number	3177461
Pages (from-to)	817-830
Number of pages	14
Journal	IEEE Transactions on Neural Networks and Learning Systems
Volume	35
Issue	1
DOI	https://doi.org/10.1109/TNNLS.2022.3177461
Publication status	Published - 1 Jan 2024


Cite this

@article{a96ee66c2afa4ebea76a49bf32c7ede6,

title = "Game-Based Backstepping Design for Strict-Feedback Nonlinear Multi-Agent Systems Based on Reinforcement Learning",

abstract = "In this article, the game-based backstepping control method is proposed for the high-order nonlinear multi-agent system with unknown dynamic and input saturation. Reinforcement learning (RL) is employed to get the saddle point solution of the tracking game between each agent and the reference signal for achieving robust control. Specifically, the approximate optimal solution of the established Hamilton-Jacobi-Isaacs (HJI) equation is obtained by policy iteration for each subsystem, and the single network adaptive critic (SNAC) architecture is used to reduce the computational burden. In addition, based on the separation operation of the error term from the derivative of the value function, we achieve the different proportions of the two agents in the game to realize the regulation of the final equilibrium point. Different from the general use of the neural network for system identification, the unknown nonlinear dynamic term is approximated based on the state difference obtained by the command filter. Furthermore, a sufficient condition is established to guarantee that the whole system and each subsystem included are uniformly ultimately bounded. Finally, simulation results are given to show the effectiveness of the proposed method.",

keywords = "Game-based backstepping, high-order multi-agent system, neural network (NN), reinforcement learning (RL), tracking game",

author = "Jia Long and Dengxiu Yu and Guoxing Wen and Li Li and Zhen Wang and Chen, {C. L. Philip}",

note = "Publisher Copyright: {\textcopyright} 2012 IEEE.",

year = "2024",

month = jan,

day = "1",

doi = "10.1109/TNNLS.2022.3177461",

language = "English",

volume = "35",

pages = "817--830",

journal = "IEEE Transactions on Neural Networks and Learning Systems",

issn = "2162-237X",

publisher = "IEEE Computational Intelligence Society",

number = "1",

}

TY - JOUR

T1 - Game-Based Backstepping Design for Strict-Feedback Nonlinear Multi-Agent Systems Based on Reinforcement Learning

AU - Long, Jia

AU - Yu, Dengxiu

AU - Wen, Guoxing

AU - Li, Li

AU - Wang, Zhen

AU - Chen, C. L. Philip

PY - 2024/1/1

Y1 - 2024/1/1

N2 - In this article, the game-based backstepping control method is proposed for the high-order nonlinear multi-agent system with unknown dynamic and input saturation. Reinforcement learning (RL) is employed to get the saddle point solution of the tracking game between each agent and the reference signal for achieving robust control. Specifically, the approximate optimal solution of the established Hamilton-Jacobi-Isaacs (HJI) equation is obtained by policy iteration for each subsystem, and the single network adaptive critic (SNAC) architecture is used to reduce the computational burden. In addition, based on the separation operation of the error term from the derivative of the value function, we achieve the different proportions of the two agents in the game to realize the regulation of the final equilibrium point. Different from the general use of the neural network for system identification, the unknown nonlinear dynamic term is approximated based on the state difference obtained by the command filter. Furthermore, a sufficient condition is established to guarantee that the whole system and each subsystem included are uniformly ultimately bounded. Finally, simulation results are given to show the effectiveness of the proposed method.

AB - In this article, the game-based backstepping control method is proposed for the high-order nonlinear multi-agent system with unknown dynamic and input saturation. Reinforcement learning (RL) is employed to get the saddle point solution of the tracking game between each agent and the reference signal for achieving robust control. Specifically, the approximate optimal solution of the established Hamilton-Jacobi-Isaacs (HJI) equation is obtained by policy iteration for each subsystem, and the single network adaptive critic (SNAC) architecture is used to reduce the computational burden. In addition, based on the separation operation of the error term from the derivative of the value function, we achieve the different proportions of the two agents in the game to realize the regulation of the final equilibrium point. Different from the general use of the neural network for system identification, the unknown nonlinear dynamic term is approximated based on the state difference obtained by the command filter. Furthermore, a sufficient condition is established to guarantee that the whole system and each subsystem included are uniformly ultimately bounded. Finally, simulation results are given to show the effectiveness of the proposed method.

KW - Game-based backstepping

KW - high-order multi-agent system

KW - neural network (NN)

KW - reinforcement learning (RL)

KW - tracking game

UR - http://www.scopus.com/inward/record.url?scp=85131731782&partnerID=8YFLogxK

U2 - 10.1109/TNNLS.2022.3177461

DO - 10.1109/TNNLS.2022.3177461

M3 - Article

AN - SCOPUS:85131731782

SN - 2162-237X

VL - 35

SP - 817

EP - 830

JO - IEEE Transactions on Neural Networks and Learning Systems

JF - IEEE Transactions on Neural Networks and Learning Systems

IS - 1

M1 - 3177461

ER -
