TY - GEN
T1 - Air combat autonomous maneuver decision for one-on-one within visual range engagement based on robust multi-agent reinforcement learning
AU - Kong, Weiren
AU - Zhou, Deyun
AU - Zhang, Kai
AU - Yang, Zhen
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/10/9
Y1 - 2020/10/9
N2 - An autonomous maneuver decision-making algorithm for one-on-one, within-visual-range UCAV air combat is designed and implemented on a robust multi-agent reinforcement learning (MARL) framework. The algorithm addresses the failure of single-agent reinforcement learning to converge during training when the environment is unstable. It also addresses a shortcoming of the MADDPG algorithm in strongly competitive environments, where it tends to learn a fragile strategy that is effective only against a specific equilibrium strategy. To this end, a minimax module is introduced to obtain the expected perturbation, which locally approximates the worst-case perturbation through the gradient. Simulation tests of algorithm convergence and policy quality show that the algorithm is effective.
AB - An autonomous maneuver decision-making algorithm for one-on-one, within-visual-range UCAV air combat is designed and implemented on a robust multi-agent reinforcement learning (MARL) framework. The algorithm addresses the failure of single-agent reinforcement learning to converge during training when the environment is unstable. It also addresses a shortcoming of the MADDPG algorithm in strongly competitive environments, where it tends to learn a fragile strategy that is effective only against a specific equilibrium strategy. To this end, a minimax module is introduced to obtain the expected perturbation, which locally approximates the worst-case perturbation through the gradient. Simulation tests of algorithm convergence and policy quality show that the algorithm is effective.
KW - Air combat
KW - Maneuver strategy
KW - Reinforcement learning
KW - Robust MADDPG
UR - http://www.scopus.com/inward/record.url?scp=85098057782&partnerID=8YFLogxK
U2 - 10.1109/ICCA51439.2020.9264567
DO - 10.1109/ICCA51439.2020.9264567
M3 - Conference contribution
AN - SCOPUS:85098057782
T3 - IEEE International Conference on Control and Automation, ICCA
SP - 506
EP - 512
BT - 2020 IEEE 16th International Conference on Control and Automation, ICCA 2020
PB - IEEE Computer Society
T2 - 16th IEEE International Conference on Control and Automation, ICCA 2020
Y2 - 9 October 2020 through 11 October 2020
ER -