TY - GEN
T1 - Dual-UAVs Maneuvering Strategy Generation Algorithm Based on Cooperative Reward Mechanism and MATD3
AU - Wang, Jiazhen
AU - Yang, Zhen
AU - Chai, Shiyuan
AU - Huo, Weiyu
AU - Zhou, Deyun
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - To solve the cooperative maneuvering decision problem of UAVs in dual-UAVs formations in air combat, this paper proposes an air combat maneuvering algorithm based on a cooperative reward mechanism and a distributed Multi-Agent Twin Delayed Deep Deterministic Policy Gradient (MATD3). Firstly, the reward function is designed according to the combat purpose of dual-UAVs air combat. Secondly, to address the sparse reward problem in air combat, a cooperative reward mechanism is introduced into the reward function, based on the idea of cooperative combat in real air combat, and a variable weight superposition method based on the optimal combat distance is introduced into the calculation of the immediate reward to reshape the reward function. The dual-UAVs formation confrontation simulation training is then conducted under the framework of the MATD3 algorithm. The simulation results show that introducing the collaborative reward mechanism and the combat distance influence factor makes the generated dual-UAVs cooperative air combat maneuver strategy reasonable and more effective.
AB - To solve the cooperative maneuvering decision problem of UAVs in dual-UAVs formations in air combat, this paper proposes an air combat maneuvering algorithm based on a cooperative reward mechanism and a distributed Multi-Agent Twin Delayed Deep Deterministic Policy Gradient (MATD3). Firstly, the reward function is designed according to the combat purpose of dual-UAVs air combat. Secondly, to address the sparse reward problem in air combat, a cooperative reward mechanism is introduced into the reward function, based on the idea of cooperative combat in real air combat, and a variable weight superposition method based on the optimal combat distance is introduced into the calculation of the immediate reward to reshape the reward function. The dual-UAVs formation confrontation simulation training is then conducted under the framework of the MATD3 algorithm. The simulation results show that introducing the collaborative reward mechanism and the combat distance influence factor makes the generated dual-UAVs cooperative air combat maneuver strategy reasonable and more effective.
KW - air-combat maneuvering decisions
KW - collaborative reward mechanism
KW - dual-UAVs
KW - MATD3
UR - http://www.scopus.com/inward/record.url?scp=85183588174&partnerID=8YFLogxK
U2 - 10.1109/ICCMA59762.2023.10374675
DO - 10.1109/ICCMA59762.2023.10374675
M3 - Conference contribution
AN - SCOPUS:85183588174
T3 - 2023 11th International Conference on Control, Mechatronics and Automation, ICCMA 2023
SP - 86
EP - 91
BT - 2023 11th International Conference on Control, Mechatronics and Automation, ICCMA 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 11th International Conference on Control, Mechatronics and Automation, ICCMA 2023
Y2 - 1 November 2023 through 3 November 2023
ER -