TY - GEN
T1 - Maneuver and Attack Strategy Generation Method for Autonomous Air Combat in Hybrid Action Space Based on Proximal Policy Optimization
AU - Zhang, Yuhe
AU - Yang, Zhen
AU - Chai, Shiyuan
AU - He, Yupeng
AU - Wang, Xingyu
AU - Zhou, Deyun
N1 - Publisher Copyright:
© 2023 Technical Committee on Control Theory, Chinese Association of Automation.
PY - 2023
Y1 - 2023
N2 - Reinforcement learning algorithms for air combat usually improve the maneuver strategy based only on the advantage or disadvantage of the air combat situation. They ignore both the basic air combat attack task, namely whether the missile hits the target, and the hybrid action space problem caused by combining a discrete missile launch strategy with a continuous maneuver strategy. To address these problems, this paper designs a reinforcement learning method based on proximal policy optimization. In this method, two separate policy networks handle the hybrid action space formed by the discrete missile launch action and the continuous maneuver action. Whether the missile hits the target is used as the evaluation criterion, and the missile launch action and maneuver action are jointly modeled. The agent thus completes the full air combat task, from occupying an advantageous situation through maneuvering to launching and guiding the missile to destroy the target. Finally, the intelligence level of the generated strategy is verified in simulation experiments of a one-versus-one UAV air combat attack mission under different initial situations. The results show that the maneuvering and missile launch strategies generated by the algorithm are reasonable and can complete the designed air combat task.
AB - Reinforcement learning algorithms for air combat usually improve the maneuver strategy based only on the advantage or disadvantage of the air combat situation. They ignore both the basic air combat attack task, namely whether the missile hits the target, and the hybrid action space problem caused by combining a discrete missile launch strategy with a continuous maneuver strategy. To address these problems, this paper designs a reinforcement learning method based on proximal policy optimization. In this method, two separate policy networks handle the hybrid action space formed by the discrete missile launch action and the continuous maneuver action. Whether the missile hits the target is used as the evaluation criterion, and the missile launch action and maneuver action are jointly modeled. The agent thus completes the full air combat task, from occupying an advantageous situation through maneuvering to launching and guiding the missile to destroy the target. Finally, the intelligence level of the generated strategy is verified in simulation experiments of a one-versus-one UAV air combat attack mission under different initial situations. The results show that the maneuvering and missile launch strategies generated by the algorithm are reasonable and can complete the designed air combat task.
KW - Air Combat
KW - Hybrid Action Space
KW - Missile Launch Strategy
KW - Proximal Policy Optimization
KW - Reinforcement Learning
UR - http://www.scopus.com/inward/record.url?scp=85175562790&partnerID=8YFLogxK
U2 - 10.23919/CCC58697.2023.10240246
DO - 10.23919/CCC58697.2023.10240246
M3 - Conference contribution
AN - SCOPUS:85175562790
T3 - Chinese Control Conference, CCC
SP - 3946
EP - 3953
BT - 2023 42nd Chinese Control Conference, CCC 2023
PB - IEEE Computer Society
T2 - 42nd Chinese Control Conference, CCC 2023
Y2 - 24 July 2023 through 26 July 2023
ER -