TY - JOUR
T1 - Air combat maneuver decision-making of unmanned combat aerial vehicles based on deep reinforcement learning
AU - Li, Yongfeng
AU - Shi, Jingping
AU - Zhang, Weiguo
AU - Jiang, Wei
N1 - Publisher Copyright:
© 2021, Editorial Board of Journal of Harbin Institute of Technology. All rights reserved.
PY - 2021/12/30
Y1 - 2021/12/30
AB - When an unmanned combat aerial vehicle (UCAV) makes autonomous maneuver decisions in air combat, it faces a heavy computational load and is vulnerable to the enemy's unpredictable maneuvers. To address these problems, this study proposes a decision-making model for autonomous UCAV maneuvering in air combat based on a deep reinforcement learning algorithm. With this algorithm, the UCAV can autonomously make maneuver decisions during air combat to reach a dominant position. First, based on the aircraft control system, a six-degree-of-freedom UCAV model was built on the MATLAB/Simulink simulation platform, and appropriate air combat actions were selected as the maneuver outputs. On this basis, the decision-making model for autonomous UCAV maneuvering in air combat was designed. An operational evaluation model was constructed from the relative motion of the two sides. The range of the missile attack zone was analyzed, and the corresponding advantage function was used as the evaluation basis for deep reinforcement learning. The UCAV was then trained in stages, from easy to difficult, and the optimal maneuver control commands were identified by examining the deep Q network. As a result, the UCAV could select appropriate maneuver actions in different situations and evaluate the battlefield situation independently, making tactical decisions that improve combat effectiveness. Simulation results suggest that the proposed method enables the UCAV to choose tactical actions independently in air combat and reach a dominant position quickly, which greatly improves its combat efficiency.
KW - Advantage function
KW - Autonomous maneuver decision in air combat
KW - Deep Q network
KW - Deep reinforcement learning
KW - Six-degree-of-freedom
KW - Unmanned combat aerial vehicle (UCAV)
UR - http://www.scopus.com/inward/record.url?scp=85121120998&partnerID=8YFLogxK
U2 - 10.11918/202005108
DO - 10.11918/202005108
M3 - Article
AN - SCOPUS:85121120998
SN - 0367-6234
VL - 53
SP - 33
EP - 41
JO - Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology
JF - Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology
IS - 12
ER -