TY - GEN
T1 - Deep Reinforcement Learning-based Behaviour Generation Algorithm for Air Combat Escape Intention
AU - Wang, Xingyu
AU - Yang, Zhen
AU - Li, Xiaoyang
AU - Chai, Shiyuan
AU - He, Yupeng
AU - Zhou, Deyun
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Although deep reinforcement learning has achieved good results in air combat, it still faces challenges such as reward design, convergence to suboptimal solutions, and poor stability. To address these challenges, this paper proposes a behaviour generation algorithm based on Dueling-Noisy-Multi-step DQN for air combat under escape intent. By analysing the air combat confrontation process, we extract the escape intention features and establish the corresponding reward model. To address the poor stability and slow convergence of deep reinforcement learning algorithms in large-scale state-action spaces, we propose the Dueling-Noisy-Multi-step DQN algorithm, which improves the accuracy of value function fitting while increasing the efficiency of exploration and network generalization. Simulation experiments comparing it with other algorithms demonstrate the excellent performance of the proposed algorithm.
UR - http://www.scopus.com/inward/record.url?scp=85200390545&partnerID=8YFLogxK
U2 - 10.1109/ICCA62789.2024.10591840
DO - 10.1109/ICCA62789.2024.10591840
M3 - Conference contribution
AN - SCOPUS:85200390545
T3 - IEEE International Conference on Control and Automation, ICCA
SP - 228
EP - 233
BT - 2024 IEEE 18th International Conference on Control and Automation, ICCA 2024
PB - IEEE Computer Society
T2 - 18th IEEE International Conference on Control and Automation, ICCA 2024
Y2 - 18 June 2024 through 21 June 2024
ER -