Hierarchical Reinforcement Learning for UAV-PE Game With Alternative Delay Update Method

Xiao Ma, Yuan Yuan, Lei Guo

Research output: Contribution to journal › Article › peer-review

Abstract

This article proposes a novel hierarchical reinforcement learning (HRL) algorithm with an alternative delay update (ADU) method for unmanned aerial vehicle pursuit-evasion (UAV-PE) game systems. In the proposed algorithm, approximate solutions of the UAV-PE game problem are derived from a hierarchical learning process, which relies on a zero-sum game process at the kinematics level and a corresponding optimal process at the dynamics level. Deep neural networks (NNs) are used to approximate the policy and value functions of the UAV-PE game systems at both the kinematics and dynamics levels. Furthermore, the ADU method is adopted to improve the training efficiency of the deep NNs by fixing one player of the UAV-PE game systems to form a stable environment. The goal of this article is to develop an HRL algorithm with an ADU method for obtaining approximate Nash equilibrium (NE) solutions of the considered UAV-PE game systems, which are subject to the coupling of kinematics and dynamics. Sufficient conditions are provided for analyzing the convergence and optimality of the proposed HRL algorithm. Moreover, overload inequalities are obtained to guarantee that the state of the dynamics tracks the control input of the kinematics in UAV-PE game systems. Finally, simulation examples are provided to demonstrate the feasibility and usefulness of the proposed HRL algorithm and ADU method.
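
The idea of fixing one player so that the other learns in a stable environment can be illustrated with a toy sketch. The code below is not the authors' algorithm: it replaces the UAV-PE game and the deep NNs with a hypothetical scalar zero-sum game f(u, v) = u^2 + u*v - v^2, in which a "pursuer" minimizes over u and an "evader" maximizes over v. The function name, the cost function, and the parameters `rounds`, `delay`, and `lr` are all assumptions made for illustration only.

```python
def alternating_delay_update(u0, v0, rounds=20, delay=10, lr=0.1):
    """Toy alternating update on f(u, v) = u**2 + u*v - v**2 (hypothetical game).

    In each round, one player is frozen while the other takes `delay`
    gradient steps, so each learner always faces a stationary opponent.
    The unique saddle point (Nash equilibrium) of this toy game is (0, 0).
    """
    u, v = float(u0), float(v0)
    for _ in range(rounds):
        # Phase 1: evader v is frozen; pursuer u descends on f.
        for _ in range(delay):
            grad_u = 2.0 * u + v          # df/du
            u -= lr * grad_u
        # Phase 2: pursuer u is frozen; evader v ascends on f.
        for _ in range(delay):
            grad_v = u - 2.0 * v          # df/dv
            v += lr * grad_v
    return u, v


if __name__ == "__main__":
    u, v = alternating_delay_update(1.0, -2.0)
    print(f"approximate equilibrium: u={u:.3e}, v={v:.3e}")
```

In this toy setting both players approach the saddle point because the game is strongly convex in u and strongly concave in v; the convergence conditions for the actual HRL algorithm are those stated in the article, not the ones implied by this sketch.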

Original language: English
Pages (from-to): 1-13
Number of pages: 13
Journal: IEEE Transactions on Neural Networks and Learning Systems
DOI
Publication status: Accepted/In press - 2024
