TY - JOUR
T1 - Imaginary filtered hindsight experience replay for UAV tracking dynamic targets in large-scale unknown environments
AU - HU, Zijian
AU - GAO, Xiaoguang
AU - WAN, Kaifang
AU - NERETIN, Evgeny
AU - LI, Jinliang
N1 - Publisher Copyright:
© 2023 Chinese Society of Aeronautics and Astronautics
PY - 2023/5
Y1 - 2023/5
AB - As an advanced combat weapon, Unmanned Aerial Vehicles (UAVs) have been widely used in military wars. In this paper, we formulated the Autonomous Navigation Control (ANC) problem of UAVs as a Markov Decision Process (MDP) and proposed a novel Deep Reinforcement Learning (DRL) method to allow UAVs to perform dynamic target tracking tasks in large-scale unknown environments. To solve the problem of limited training experience, the proposed Imaginary Filtered Hindsight Experience Replay (IFHER) generates successful episodes by reasonably imagining the target trajectory in the failed episode to augment the experiences. The well-designed goal, episode, and quality filtering strategies ensure that only high-quality augmented experiences can be stored, while the sampling filtering strategy of IFHER ensures that these stored augmented experiences can be fully learned according to their high priorities. By training in a complex environment constructed based on the parameters of a real UAV, the proposed IFHER algorithm improves the convergence speed by 28.99% and the convergence result by 11.57% compared to the state-of-the-art Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. The testing experiments carried out in environments with different complexities demonstrate the strong robustness and generalization ability of the IFHER agent. Moreover, the flight trajectory of the IFHER agent shows the superiority of the learned policy and the practical application value of the algorithm.
KW - Artificial intelligence
KW - Autonomous navigation control
KW - Deep reinforcement learning
KW - Hindsight experience replay
KW - UAV
UR - http://www.scopus.com/inward/record.url?scp=85152417251&partnerID=8YFLogxK
DO - 10.1016/j.cja.2022.09.008
M3 - Article
AN - SCOPUS:85152417251
SN - 1000-9361
VL - 36
SP - 377
EP - 391
JO - Chinese Journal of Aeronautics
JF - Chinese Journal of Aeronautics
IS - 5
ER -