Reinforcement Learning Based Solution to Two-player Zero-sum Game Using Differentiator

  • Northwestern Polytechnical University Xian

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, a synchronous adaptive learning algorithm based on reinforcement learning (RL) is proposed for solving two-player zero-sum games for partially unknown systems. To approximate the unknown drift dynamics required to solve the Hamilton-Jacobi-Isaacs equation, a first-order robust exact differentiator (RED) is employed to estimate the state derivatives; an estimate of the unknown drift dynamics is then obtained from these derivative estimates together with the known input and disturbance dynamics. An actor-critic-disturbance neural network (NN) structure is established to approximate the optimal control policy, the value function, and the disturbance policy, respectively. An online synchronous tuning algorithm for the three NNs is proposed, based on the RL technique and the designed first-order RED. By applying Lyapunov theory, the proposed method is shown to guarantee that the optimum is reached under the worst-case disturbance and that the closed-loop system is stabilized. Finally, the effectiveness of the presented scheme is demonstrated by two simulation examples, one linear and one nonlinear.
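
The differentiator step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a scalar system of the form x' = f(x) + g(x)u + k(x)d, uses the standard super-twisting form of a first-order robust exact differentiator with the commonly cited gains 1.5 and 1.1, and integrates with a plain Euler step. The names `FirstOrderRED`, `estimate_drift`, and `differentiate_signal` are hypothetical, as is the Lipschitz bound passed in by the caller.

```python
import math


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


class FirstOrderRED:
    """Sketch of a first-order robust exact differentiator (super-twisting form).

    z0 tracks the measured signal and z1 tracks its time derivative.
    `lipschitz` is an assumed upper bound on the magnitude of the signal's
    second derivative; lam0 and lam1 are commonly used gain values, not
    values taken from the paper.
    """

    def __init__(self, lipschitz, z0=0.0, z1=0.0, lam0=1.5, lam1=1.1):
        self.L = lipschitz
        self.z0 = z0
        self.z1 = z1
        self.lam0 = lam0
        self.lam1 = lam1

    def step(self, measurement, dt):
        """Advance one Euler step and return the current derivative estimate."""
        e = self.z0 - measurement
        v0 = -self.lam0 * math.sqrt(self.L) * math.sqrt(abs(e)) * sign(e) + self.z1
        self.z0 += dt * v0
        self.z1 += dt * (-self.lam1 * self.L * sign(e))
        return self.z1


def estimate_drift(x_dot_hat, g_x, u, k_x, d):
    """Drift estimate for the assumed model x' = f(x) + g(x) u + k(x) d.

    The known input and disturbance terms are subtracted from the
    estimated state derivative, leaving an estimate of f(x).
    """
    return x_dot_hat - g_x * u - k_x * d


def differentiate_signal(signal, t_end, dt, lipschitz):
    """Run the differentiator on signal(t) over [0, t_end]; return the final estimate."""
    red = FirstOrderRED(lipschitz, z0=signal(0.0))
    estimate = 0.0
    for i in range(int(round(t_end / dt))):
        estimate = red.step(signal(i * dt), dt)
    return estimate
```

For example, `differentiate_signal(math.sin, 10.0, 1e-3, 2.0)` should return a value close to `math.cos(10.0)` once the estimate has converged; the residual error depends on the step size and the chosen bound.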

Original language: English
Title of host publication: ICARM 2018 - 2018 3rd International Conference on Advanced Robotics and Mechatronics
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 708-713
Number of pages: 6
ISBN (Electronic): 9781538670668
State: Published - 11 Jan 2019
Event: 3rd IEEE International Conference on Advanced Robotics and Mechatronics, ICARM 2018 - Singapore, Singapore
Duration: 18 Jul 2018 to 20 Jul 2018

Publication series

Name: ICARM 2018 - 2018 3rd International Conference on Advanced Robotics and Mechatronics

Conference

Conference: 3rd IEEE International Conference on Advanced Robotics and Mechatronics, ICARM 2018
Country/Territory: Singapore
City: Singapore
Period: 18/07/18 to 20/07/18

Keywords

  • First-order robust exact differentiator
  • neural network
  • reinforcement learning
  • two-player zero-sum game