TY - JOUR
T1 - Relevant experience learning
T2 - A deep reinforcement learning method for UAV autonomous motion planning in complex unknown environments
AU - HU, Zijian
AU - GAO, Xiaoguang
AU - WAN, Kaifang
AU - ZHAI, Yiwei
AU - WANG, Qianglong
N1 - Publisher Copyright:
© 2021 Chinese Society of Aeronautics and Astronautics
PY - 2021/12
Y1 - 2021/12
N2 - Unmanned Aerial Vehicles (UAVs) play a vital role in military operations. In a variety of battlefield mission scenarios, UAVs are required to fly safely to designated locations without human intervention. Therefore, finding a suitable method to solve the UAV Autonomous Motion Planning (AMP) problem can improve the success rate of UAV missions to a certain extent. In recent years, many studies have applied Deep Reinforcement Learning (DRL) methods to the AMP problem and achieved good results. From the perspective of sampling, this paper designs a double-screening sampling method, combines it with the Deep Deterministic Policy Gradient (DDPG) algorithm, and proposes the Relevant Experience Learning-DDPG (REL-DDPG) algorithm. REL-DDPG uses a Prioritized Experience Replay (PER) mechanism to break the correlation between consecutive experiences in the experience pool, selects the experiences most similar to the current state for learning, in line with theories from human education, and amplifies the influence of the learning process on action selection in the current state. All experiments are conducted in a complex unknown simulation environment constructed from the parameters of a real UAV. Training experiments show that REL-DDPG converges faster and to better results than the state-of-the-art DDPG algorithm, while testing experiments demonstrate the applicability of the algorithm and investigate its performance under different parameter conditions.
AB - Unmanned Aerial Vehicles (UAVs) play a vital role in military operations. In a variety of battlefield mission scenarios, UAVs are required to fly safely to designated locations without human intervention. Therefore, finding a suitable method to solve the UAV Autonomous Motion Planning (AMP) problem can improve the success rate of UAV missions to a certain extent. In recent years, many studies have applied Deep Reinforcement Learning (DRL) methods to the AMP problem and achieved good results. From the perspective of sampling, this paper designs a double-screening sampling method, combines it with the Deep Deterministic Policy Gradient (DDPG) algorithm, and proposes the Relevant Experience Learning-DDPG (REL-DDPG) algorithm. REL-DDPG uses a Prioritized Experience Replay (PER) mechanism to break the correlation between consecutive experiences in the experience pool, selects the experiences most similar to the current state for learning, in line with theories from human education, and amplifies the influence of the learning process on action selection in the current state. All experiments are conducted in a complex unknown simulation environment constructed from the parameters of a real UAV. Training experiments show that REL-DDPG converges faster and to better results than the state-of-the-art DDPG algorithm, while testing experiments demonstrate the applicability of the algorithm and investigate its performance under different parameter conditions.
KW - Autonomous Motion Planning (AMP)
KW - Deep Deterministic Policy Gradient (DDPG)
KW - Deep Reinforcement Learning (DRL)
KW - Sampling method
KW - UAV
UR - http://www.scopus.com/inward/record.url?scp=85109444094&partnerID=8YFLogxK
U2 - 10.1016/j.cja.2020.12.027
DO - 10.1016/j.cja.2020.12.027
M3 - Article
AN - SCOPUS:85109444094
SN - 1000-9361
VL - 34
SP - 187
EP - 204
JO - Chinese Journal of Aeronautics
JF - Chinese Journal of Aeronautics
IS - 12
ER -