Maneuver and Attack Strategy Generation Method for Autonomous Air Combat in Hybrid Action Space Based on Proximal Policy Optimization

Yuhe Zhang, Zhen Yang, Shiyuan Chai, Yupeng He, Xingyu Wang, Deyun Zhou

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

2 Citations (Scopus)

Abstract

Reinforcement learning algorithms for air combat usually improve only the maneuver strategy, based on how advantageous or disadvantageous the air combat situation is. They ignore the basic air combat attack task, namely whether the missile hits the target, as well as the hybrid action space problem caused by combining a discrete missile launch strategy with a continuous maneuver strategy. To solve this problem, this paper designs a reinforcement learning method based on proximal policy optimization. In this method, two separate policy networks are used to handle the hybrid action space formed by the discrete missile launch action and the continuous maneuver action. Whether the missile hits the target is taken as the evaluation criterion, and the missile launch action and maneuver action are jointly modeled. The method thus covers the complete air combat task, from occupying an advantageous situation through maneuvering to the missile launch action that guides the missile to destroy the target. Finally, the intelligence level of the generated strategy is verified in simulation experiments of a UAV 1-versus-1 air combat attack mission under different initial situations. The results show that the maneuvering strategy and missile launch strategy generated by this algorithm are reasonable and can complete the designed air combat task.
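
The abstract does not give the network or loss details, so the following is only a minimal sketch of one common way to treat a hybrid action space with PPO. It assumes a diagonal Gaussian head for the continuous maneuver and a Bernoulli head for the discrete launch decision. It also assumes the two heads are independent given the state, so that their log-probabilities add. All function names, the three-dimensional maneuver vector, and the numeric values are hypothetical and are not taken from the paper.

```python
import math
import random


def gaussian_log_prob(x, mean, std):
    """Log-density of a diagonal Gaussian, summed over action dimensions."""
    return sum(
        -0.5 * ((xi - mi) / si) ** 2 - math.log(si) - 0.5 * math.log(2 * math.pi)
        for xi, mi, si in zip(x, mean, std)
    )


def bernoulli_log_prob(fire, p_fire):
    """Log-probability of a binary launch decision (1 = fire, 0 = hold)."""
    return math.log(p_fire) if fire == 1 else math.log(1.0 - p_fire)


def sample_hybrid_action(mean, std, p_fire, rng):
    """Sample a continuous maneuver and a discrete launch decision.

    Assumption: the two heads are sampled independently given the state.
    """
    maneuver = [rng.gauss(m, s) for m, s in zip(mean, std)]
    fire = 1 if rng.random() < p_fire else 0
    return maneuver, fire


def joint_log_prob(maneuver, fire, mean, std, p_fire):
    """Under the independence assumption, the joint log-probability is a sum."""
    return gaussian_log_prob(maneuver, mean, std) + bernoulli_log_prob(fire, p_fire)


def ppo_clip_objective(logp_new, logp_old, advantage, eps=0.2):
    """Standard PPO clipped surrogate for one sample (to be maximized)."""
    ratio = math.exp(logp_new - logp_old)
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical outputs of the two policy networks for one state.
    mean, std, p_fire_old = [0.0, 0.5, -0.2], [0.3, 0.3, 0.3], 0.1
    maneuver, fire = sample_hybrid_action(mean, std, p_fire_old, rng)
    logp_old = joint_log_prob(maneuver, fire, mean, std, p_fire_old)
    # After an update, suppose the launch head outputs a different probability.
    logp_new = joint_log_prob(maneuver, fire, mean, std, 0.2)
    print(ppo_clip_objective(logp_new, logp_old, advantage=1.0))
```

Summing the two log-probabilities gives a single probability ratio for the combined action. An alternative, equally plausible design would compute a separate clipped ratio for each policy network. The abstract does not say which variant the authors use.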

Original language: English
Title of host publication: 2023 42nd Chinese Control Conference, CCC 2023
Publisher: IEEE Computer Society
Pages: 3946-3953
Number of pages: 8
ISBN (Electronic): 9789887581543
Publication status: Published - 2023
Event: 42nd Chinese Control Conference, CCC 2023 - Tianjin, China
Duration: 24 Jul 2023 to 26 Jul 2023

Publication series

Name: Chinese Control Conference, CCC
Volume: 2023-July
ISSN (Print): 1934-1768
ISSN (Electronic): 2161-2927

Conference

Conference: 42nd Chinese Control Conference, CCC 2023
Country/Territory: China
City: Tianjin
Period: 24/07/23 to 26/07/23
