TY - JOUR
T1 - Relevant experience learning
T2 - A deep reinforcement learning method for UAV autonomous motion planning in complex unknown environments
AU - HU, Zijian
AU - GAO, Xiaoguang
AU - WAN, Kaifang
AU - ZHAI, Yiwei
AU - WANG, Qianglong
N1 - Publisher Copyright:
© 2021 Chinese Society of Aeronautics and Astronautics
PY - 2021/12
Y1 - 2021/12
N2 - Unmanned Aerial Vehicles (UAVs) play a vital role in military operations. In a variety of battlefield mission scenarios, UAVs are required to fly safely to designated locations without human intervention. Therefore, finding a suitable method to solve the UAV Autonomous Motion Planning (AMP) problem can improve the success rate of UAV missions to a certain extent. In recent years, many studies have applied Deep Reinforcement Learning (DRL) methods to the AMP problem and achieved good results. From the perspective of sampling, this paper designs a double-screening sampling method, combines it with the Deep Deterministic Policy Gradient (DDPG) algorithm, and proposes the Relevant Experience Learning-DDPG (REL-DDPG) algorithm. REL-DDPG uses a Prioritized Experience Replay (PER) mechanism to break the correlation between consecutive experiences in the experience pool, selects the experiences most similar to the current state for learning, in line with theories from human education, and amplifies the influence of the learning process on action selection in the current state. All experiments are conducted in a complex unknown simulation environment constructed from the parameters of a real UAV. Training experiments show that REL-DDPG converges faster and to better results than the state-of-the-art DDPG algorithm, while testing experiments demonstrate the applicability of the algorithm and investigate its performance under different parameter conditions.
AB - Unmanned Aerial Vehicles (UAVs) play a vital role in military operations. In a variety of battlefield mission scenarios, UAVs are required to fly safely to designated locations without human intervention. Therefore, finding a suitable method to solve the UAV Autonomous Motion Planning (AMP) problem can improve the success rate of UAV missions to a certain extent. In recent years, many studies have applied Deep Reinforcement Learning (DRL) methods to the AMP problem and achieved good results. From the perspective of sampling, this paper designs a double-screening sampling method, combines it with the Deep Deterministic Policy Gradient (DDPG) algorithm, and proposes the Relevant Experience Learning-DDPG (REL-DDPG) algorithm. REL-DDPG uses a Prioritized Experience Replay (PER) mechanism to break the correlation between consecutive experiences in the experience pool, selects the experiences most similar to the current state for learning, in line with theories from human education, and amplifies the influence of the learning process on action selection in the current state. All experiments are conducted in a complex unknown simulation environment constructed from the parameters of a real UAV. Training experiments show that REL-DDPG converges faster and to better results than the state-of-the-art DDPG algorithm, while testing experiments demonstrate the applicability of the algorithm and investigate its performance under different parameter conditions.
KW - Autonomous Motion Planning (AMP)
KW - Deep Deterministic Policy Gradient (DDPG)
KW - Deep Reinforcement Learning (DRL)
KW - Sampling method
KW - UAV
UR - http://www.scopus.com/inward/record.url?scp=85109444094&partnerID=8YFLogxK
U2 - 10.1016/j.cja.2020.12.027
DO - 10.1016/j.cja.2020.12.027
M3 - Article
AN - SCOPUS:85109444094
SN - 1000-9361
VL - 34
SP - 187
EP - 204
JO - Chinese Journal of Aeronautics
JF - Chinese Journal of Aeronautics
IS - 12
ER -