A Multi-UCAV Cooperative Decision-Making Method Based on an MAPPO Algorithm for Beyond-Visual-Range Air Combat

Xiaoxiong Liu; Yi Yin; Yuzhan Su; Ruichen Ming

doi:10.3390/aerospace9100563

School of Automation

Northwestern Polytechnical University, Xi'an

Research output: Contribution to journal › Article › peer-review

26 Citations (Scopus)

Abstract

To solve the problems of autonomous decision making and the cooperative operation of multiple unmanned combat aerial vehicles (UCAVs) in beyond-visual-range air combat, this paper proposes an air combat decision-making method that is based on a multi-agent proximal policy optimization (MAPPO) algorithm. Firstly, the model of the unmanned combat aircraft is established on the simulation platform, and the corresponding maneuver library is designed. In order to simulate the real beyond-visual-range air combat, the missile attack area model is established, and the probability of damage occurring is given according to both the enemy and us. Secondly, to overcome the sparse return problem of traditional reinforcement learning, according to the angle, speed, altitude, distance of the unmanned combat aircraft, and the damage of the missile attack area, this paper designs a comprehensive reward function. Finally, the idea of centralized training and distributed implementation is adopted to improve the decision-making ability of the unmanned combat aircraft and improve the training efficiency of the algorithm. The simulation results show that this algorithm can carry out a multi-aircraft air combat confrontation drill, form new tactical decisions in the drill process, and provide new ideas for multi-UCAV air combat.

Original language	English
Article number	563
Journal	Aerospace
Volume	9
Issue number	10
DOI	https://doi.org/10.3390/aerospace9100563
Publication status	Published - Oct 2022

Cite this

@article{fde01c40c7b94fc4bac55c111f3aeed9,

title = "A Multi-UCAV Cooperative Decision-Making Method Based on an MAPPO Algorithm for Beyond-Visual-Range Air Combat",

abstract = "To solve the problems of autonomous decision making and the cooperative operation of multiple unmanned combat aerial vehicles (UCAVs) in beyond-visual-range air combat, this paper proposes an air combat decision-making method that is based on a multi-agent proximal policy optimization (MAPPO) algorithm. Firstly, the model of the unmanned combat aircraft is established on the simulation platform, and the corresponding maneuver library is designed. In order to simulate the real beyond-visual-range air combat, the missile attack area model is established, and the probability of damage occurring is given according to both the enemy and us. Secondly, to overcome the sparse return problem of traditional reinforcement learning, according to the angle, speed, altitude, distance of the unmanned combat aircraft, and the damage of the missile attack area, this paper designs a comprehensive reward function. Finally, the idea of centralized training and distributed implementation is adopted to improve the decision-making ability of the unmanned combat aircraft and improve the training efficiency of the algorithm. The simulation results show that this algorithm can carry out a multi-aircraft air combat confrontation drill, form new tactical decisions in the drill process, and provide new ideas for multi-UCAV air combat.",

keywords = "centralized training and distributed execution, comprehensive reward, multi-agent proximal policy optimization, multiple unmanned combat aerial vehicles, the missile attack area model",

author = "Xiaoxiong Liu and Yi Yin and Yuzhan Su and Ruichen Ming",

note = "Publisher Copyright: {\textcopyright} 2022 by the authors.",

year = "2022",

month = oct,

doi = "10.3390/aerospace9100563",

language = "English",

volume = "9",

journal = "Aerospace",

issn = "2226-4310",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "10",

}

TY - JOUR

T1 - A Multi-UCAV Cooperative Decision-Making Method Based on an MAPPO Algorithm for Beyond-Visual-Range Air Combat

AU - Liu, Xiaoxiong

AU - Yin, Yi

AU - Su, Yuzhan

AU - Ming, Ruichen

PY - 2022/10

Y1 - 2022/10

AB - To solve the problems of autonomous decision making and the cooperative operation of multiple unmanned combat aerial vehicles (UCAVs) in beyond-visual-range air combat, this paper proposes an air combat decision-making method that is based on a multi-agent proximal policy optimization (MAPPO) algorithm. Firstly, the model of the unmanned combat aircraft is established on the simulation platform, and the corresponding maneuver library is designed. In order to simulate the real beyond-visual-range air combat, the missile attack area model is established, and the probability of damage occurring is given according to both the enemy and us. Secondly, to overcome the sparse return problem of traditional reinforcement learning, according to the angle, speed, altitude, distance of the unmanned combat aircraft, and the damage of the missile attack area, this paper designs a comprehensive reward function. Finally, the idea of centralized training and distributed implementation is adopted to improve the decision-making ability of the unmanned combat aircraft and improve the training efficiency of the algorithm. The simulation results show that this algorithm can carry out a multi-aircraft air combat confrontation drill, form new tactical decisions in the drill process, and provide new ideas for multi-UCAV air combat.

KW - centralized training and distributed execution

KW - comprehensive reward

KW - multi-agent proximal policy optimization

KW - multiple unmanned combat aerial vehicles

KW - the missile attack area model

UR - http://www.scopus.com/inward/record.url?scp=85140397644&partnerID=8YFLogxK

U2 - 10.3390/aerospace9100563

DO - 10.3390/aerospace9100563

M3 - Article

AN - SCOPUS:85140397644

SN - 2226-4310

VL - 9

JO - Aerospace

JF - Aerospace

IS - 10

M1 - 563

ER -
