Reinforcement Learning Based Solution to Two-player Zero-sum Game Using Differentiator

Xinxin Guo, Weisheng Yan, Peng Cui, Shouxu Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, a synchronous adaptive learning algorithm based on reinforcement learning (RL) is proposed to solve two-player zero-sum games for partially unknown systems. To approximate the unknown drift dynamics required to solve the Hamilton-Jacobi-Isaacs equation, one feasible method is to employ a first-order robust exact differentiator (RED) to estimate the state derivatives; an estimate of the unknown drift dynamics is then obtained from these estimates together with the known input and disturbance dynamics. An actor-critic-disturbance neural network (NN) structure is established to approximate the optimal control policy, the value function, and the disturbance policy, respectively. An online synchronous tuning algorithm for the three NNs is proposed by applying the RL technique and the designed first-order RED. The proposed method guarantees that the optimum is reached under the worst-case disturbance, and stability of the closed-loop system is proved via Lyapunov theory. Finally, the effectiveness of the presented scheme is demonstrated by a linear and a nonlinear simulation example.
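The first-order RED referred to in the abstract is commonly realized as a sliding-mode (Levant-style) observer whose second state converges in finite time to the derivative of the measured signal. The sketch below is a generic illustration of that idea, not the authors' exact design; the gains `lam` and `alpha` and the Euler discretization are assumptions chosen for the example.

```python
import math

def sgn(x):
    """Sign function returning -1, 0, or 1."""
    return (x > 0) - (x < 0)

def red_differentiate(f, t0, t1, dt, lam=6.0, alpha=8.0):
    """First-order robust exact differentiator (Levant-style sliding-mode form).

    z0 tracks the measured signal f(t); z1 converges after a finite-time
    transient to an estimate of f'(t). The gains lam and alpha must dominate
    the Lipschitz constant of f'. Returns a list of (t, z1) samples.
    """
    z0, z1 = f(t0), 0.0
    t, out = t0, []
    while t < t1:
        e = z0 - f(t)                       # observer tracking error
        v = -lam * math.sqrt(abs(e)) * sgn(e) + z1
        z0 += v * dt                        # Euler step for the tracking state
        z1 += -alpha * sgn(e) * dt          # Euler step for the derivative estimate
        t += dt
        out.append((t, z1))
    return out
```

For instance, feeding the differentiator `math.sin` yields `z1` close to `cos(t)` once the transient has died out; in the paper's setting the resulting state-derivative estimates would then be combined with the known input and disturbance dynamics to recover the unknown drift term.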

Original language: English
Title of host publication: ICARM 2018 - 2018 3rd International Conference on Advanced Robotics and Mechatronics
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 708-713
Number of pages: 6
ISBN (electronic): 9781538670668
DOI
Publication status: Published - 11 Jan 2019
Event: 3rd IEEE International Conference on Advanced Robotics and Mechatronics, ICARM 2018 - Singapore, Singapore
Duration: 18 Jul 2018 - 20 Jul 2018

Publication series

Name: ICARM 2018 - 2018 3rd International Conference on Advanced Robotics and Mechatronics

Conference

Conference: 3rd IEEE International Conference on Advanced Robotics and Mechatronics, ICARM 2018
Country/Territory: Singapore
Period: 18/07/18 - 20/07/18
