基于深度强化学习与自学习的多无人机近距空战机动策略生成算法

Wei Ren Kong; De Yun Zhou; Yi Yang Zhao; Wan Sha Yang

doi:10.7641/CTA.2021.10120

Translated title of the contribution: Maneuvering strategy generation algorithm for multi-UAV in close-range air combat based on deep reinforcement learning and self-play

School of Electronics and Information

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

To solve the problem of maneuvering decision-making in multi-UAV close-range air combat, a maneuvering strategy generation algorithm based on a parameter-sharing Q network and neural fictitious self-play is proposed. First, a hybrid Markov game model suitable for different UAV formation sizes is designed, together with a reinforcement learning framework for generating multi-UAV maneuvering decision strategies, the parameter-sharing Q network; the state space is compressed by an autoencoder to improve the efficiency of strategy learning. Then, neural fictitious self-play is used to make the maneuver strategy converge to the Nash equilibrium strategy. Finally, simulation experiments are carried out on the parameter selection of the autoencoder, the training process of the strategy generation algorithm, and the rationality and portability of the maneuver strategy. The simulation results show that introducing the autoencoder effectively improves the efficiency of strategy learning, and that the multi-UAV close-range air combat maneuver strategy generated by this algorithm is reasonable and has good portability.
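The abstract does not describe the network itself, so the following is only a minimal sketch of the general parameter-sharing idea, not the authors' implementation. It assumes a linear stand-in for the Q network, and every name and constant in it (`N_ACTIONS`, `OBS_DIM`, `make_shared_weights`, `q_values`, `greedy_joint_action`) is hypothetical. The point it illustrates is that one set of weights, applied to each UAV's own observation, can serve a formation of any size.

```python
import random

N_ACTIONS = 3  # hypothetical number of discrete maneuvers
OBS_DIM = 4    # hypothetical length of one UAV's observation vector


def make_shared_weights(seed=0):
    """Create one weight matrix (N_ACTIONS x OBS_DIM) shared by all UAVs."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(OBS_DIM)]
            for _ in range(N_ACTIONS)]


def q_values(weights, obs):
    """Linear stand-in for a Q network: one value per action."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


def greedy_joint_action(weights, observations):
    """Pick each UAV's greedy action using the same shared weights."""
    joint = []
    for obs in observations:
        q = q_values(weights, obs)
        joint.append(max(range(N_ACTIONS), key=lambda a: q[a]))
    return joint
```

Because the weights do not depend on the number of UAVs, the same function can be called with two observations or with four; whether the paper's portability result relies on exactly this mechanism is not stated in the abstract.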

Original language: Chinese (Traditional)
Pages (from-to): 352-362
Number of pages: 11
Journal: Kongzhi Lilun Yu Yingyong/Control Theory and Applications
Volume: 39
Issue number: 2
DOI: https://doi.org/10.7641/CTA.2021.10120
State: Published - Feb 2022


Cite this

@article{6f120c9930ec4933847534f2f0cb2b2c,

title = "基于深度强化学习与自学习的多无人机近距空战机动策略生成算法",

abstract = "To solve the problem of maneuvering decision-making in multi-UAV close-range air combat, a maneuvering strategy generation algorithm based on a parameter-sharing Q network and neural fictitious self-play is proposed. First, a hybrid Markov game model suitable for different UAV formation sizes is designed, together with a reinforcement learning framework for generating multi-UAV maneuvering decision strategies, the parameter-sharing Q network; the state space is compressed by an autoencoder to improve the efficiency of strategy learning. Then, neural fictitious self-play is used to make the maneuver strategy converge to the Nash equilibrium strategy. Finally, simulation experiments are carried out on the parameter selection of the autoencoder, the training process of the strategy generation algorithm, and the rationality and portability of the maneuver strategy. The simulation results show that introducing the autoencoder effectively improves the efficiency of strategy learning, and that the multi-UAV close-range air combat maneuver strategy generated by this algorithm is reasonable and has good portability.",

keywords = "Air combat decision-making, Fictitious self-play, Multi-UAV cooperation, Reinforcement learning",

author = "Kong, {Wei Ren} and Zhou, {De Yun} and Zhao, {Yi Yang} and Yang, {Wan Sha}",

year = "2022",

month = feb,

doi = "10.7641/CTA.2021.10120",

language = "Chinese (Traditional)",

volume = "39",

pages = "352--362",

journal = "Kongzhi Lilun Yu Yingyong/Control Theory and Applications",

issn = "1000-8152",

publisher = "South China University of Technology",

number = "2",

}

TY - JOUR

T1 - 基于深度强化学习与自学习的多无人机近距空战机动策略生成算法

AU - Kong, Wei Ren

AU - Zhou, De Yun

AU - Zhao, Yi Yang

AU - Yang, Wan Sha

PY - 2022/2

Y1 - 2022/2

N2 - To solve the problem of maneuvering decision-making in multi-UAV close-range air combat, a maneuvering strategy generation algorithm based on a parameter-sharing Q network and neural fictitious self-play is proposed. First, a hybrid Markov game model suitable for different UAV formation sizes is designed, together with a reinforcement learning framework for generating multi-UAV maneuvering decision strategies, the parameter-sharing Q network; the state space is compressed by an autoencoder to improve the efficiency of strategy learning. Then, neural fictitious self-play is used to make the maneuver strategy converge to the Nash equilibrium strategy. Finally, simulation experiments are carried out on the parameter selection of the autoencoder, the training process of the strategy generation algorithm, and the rationality and portability of the maneuver strategy. The simulation results show that introducing the autoencoder effectively improves the efficiency of strategy learning, and that the multi-UAV close-range air combat maneuver strategy generated by this algorithm is reasonable and has good portability.

KW - Air combat decision-making

KW - Fictitious self-play

KW - Multi-UAV cooperation

KW - Reinforcement learning

UR - http://www.scopus.com/inward/record.url?scp=85127308447&partnerID=8YFLogxK

U2 - 10.7641/CTA.2021.10120

DO - 10.7641/CTA.2021.10120

M3 - Article

AN - SCOPUS:85127308447

SN - 1000-8152

VL - 39

SP - 352

EP - 362

JO - Kongzhi Lilun Yu Yingyong/Control Theory and Applications

JF - Kongzhi Lilun Yu Yingyong/Control Theory and Applications

IS - 2

ER -
