TY - JOUR
T1 - Imaginary filtered hindsight experience replay for UAV tracking dynamic targets in large-scale unknown environments
AU - HU, Zijian
AU - GAO, Xiaoguang
AU - WAN, Kaifang
AU - NERETIN, Evgeny
AU - LI, Jinliang
N1 - Publisher Copyright:
© 2023 Chinese Society of Aeronautics and Astronautics
PY - 2023/5
Y1 - 2023/5
AB - As an advanced combat weapon, Unmanned Aerial Vehicles (UAVs) have been widely used in military wars. In this paper, we formulated the Autonomous Navigation Control (ANC) problem of UAVs as a Markov Decision Process (MDP) and proposed a novel Deep Reinforcement Learning (DRL) method to allow UAVs to perform dynamic target tracking tasks in large-scale unknown environments. To solve the problem of limited training experience, the proposed Imaginary Filtered Hindsight Experience Replay (IFHER) generates successful episodes by reasonably imagining the target trajectory in the failed episode to augment the experiences. The well-designed goal, episode, and quality filtering strategies ensure that only high-quality augmented experiences can be stored, while the sampling filtering strategy of IFHER ensures that these stored augmented experiences can be fully learned according to their high priorities. By training in a complex environment constructed based on the parameters of a real UAV, the proposed IFHER algorithm improves the convergence speed by 28.99% and the convergence result by 11.57% compared to the state-of-the-art Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. The testing experiments carried out in environments with different complexities demonstrate the strong robustness and generalization ability of the IFHER agent. Moreover, the flight trajectory of the IFHER agent shows the superiority of the learned policy and the practical application value of the algorithm.
KW - Artificial intelligence
KW - Autonomous navigation control
KW - Deep reinforcement learning
KW - Hindsight experience replay
KW - UAV
UR - http://www.scopus.com/inward/record.url?scp=85152417251&partnerID=8YFLogxK
DO - 10.1016/j.cja.2022.09.008
M3 - Article
AN - SCOPUS:85152417251
SN - 1000-9361
VL - 36
SP - 377
EP - 391
JO - Chinese Journal of Aeronautics
JF - Chinese Journal of Aeronautics
IS - 5
ER -