TY - JOUR
T1 - Autonomous maneuver decision-making for a UCAV in short-range aerial combat based on an MS-DDQN algorithm
AU - Li, Yongfeng
AU - Shi, Jingping
AU - Jiang, Wei
AU - Zhang, Weiguo
AU - Lyu, Yongxi
N1 - Publisher Copyright:
© 2021 China Ordnance Society
PY - 2022/9
Y1 - 2022/9
AB - To enable rapid and accurate autonomous aerial combat decision-making for unmanned combat aerial vehicles (UCAVs) in uncertain environments, this paper proposes a decision-making method based on an improved deep reinforcement learning (DRL) algorithm: the multi-step double deep Q-network (MS-DDQN). First, a six-degree-of-freedom UCAV model based on an aircraft control system is established on a simulation platform, and situation assessment functions for the UCAV and its target are constructed from their angles, altitudes, environments, missile attack performance, and UCAV performance. By controlling the flight path angle, roll angle, and flight velocity, 27 common basic maneuvers are designed. On this basis, to overcome the slow training and convergence of traditional DRL, the improved MS-DDQN method is introduced to incorporate the final return value into the preceding steps. Finally, the pre-trained model is used as the starting point for a second round of learning that simulates the UCAV aerial combat decision-making process, which shortens the training time and improves learning efficiency. The improved DRL algorithm significantly accelerates training, estimates the target value more accurately, and can be applied to aerial combat decision-making.
KW - Aerial combat decision
KW - Aerial combat maneuver library
KW - Multi-step double deep Q-network
KW - Six-degree-of-freedom
KW - Unmanned combat aerial vehicle
UR - http://www.scopus.com/inward/record.url?scp=85122668367&partnerID=8YFLogxK
U2 - 10.1016/j.dt.2021.09.014
DO - 10.1016/j.dt.2021.09.014
M3 - Article
AN - SCOPUS:85122668367
SN - 2096-3459
VL - 18
SP - 1697
EP - 1714
JO - Defence Technology
JF - Defence Technology
IS - 9
ER -