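Maneuvering strategy generation algorithm for multi-UAV in close-range air combat based on deep reinforcement learning and self-play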

Wei Ren Kong; De Yun Zhou; Yi Yang Zhao; Wan Sha Yang

doi:10.7641/CTA.2021.10120

School of Electronics and Information

Research output: Contribution to journal › Article › peer-review

14 Citations (Scopus)

Abstract

To solve the maneuvering decision-making problem in multi-UAV close-range air combat, a maneuvering strategy generation algorithm based on a parameter-sharing Q network and neural fictitious self-play is proposed. First, a hybrid Markov game model suitable for different UAV formation sizes is designed, together with a reinforcement learning framework for generating multi-UAV maneuvering decision strategies, the parameter-sharing Q network; the state space is compressed by an autoencoder to improve the efficiency of strategy learning. Then, neural fictitious self-play is used to make the maneuver strategy converge to a Nash equilibrium strategy. Finally, simulation experiments are carried out on the parameter selection of the autoencoder, the training process of the strategy generation algorithm, and the rationality and portability of the maneuver strategy. The simulation results show that introducing the autoencoder effectively improves the efficiency of strategy learning, and that the multi-UAV close-range air combat maneuver strategy generated by the algorithm is reasonable and has good portability.
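The action-selection idea behind neural fictitious self-play with shared parameters can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the network architecture, the autoencoder, or any hyperparameters, so the names `nfsp_act`, `formation_actions`, `q_fn`, `pi_fn`, `eta`, and `epsilon` are all assumptions. It follows the general scheme of neural fictitious self-play, in which an agent acts from its best response (epsilon-greedy on Q values) with some probability and from its average policy otherwise, and it models parameter sharing by letting every UAV in a formation call the same two functions.

```python
import random
from typing import Callable, List, Sequence, Tuple


def nfsp_act(
    q_values: Sequence[float],
    avg_policy: Sequence[float],
    eta: float,
    epsilon: float,
    rng: random.Random,
) -> Tuple[int, bool]:
    """Return (action index, used_best_response) for one agent.

    eta is the (assumed) probability of acting from the best response;
    epsilon is the exploration rate inside the best-response branch.
    """
    n = len(q_values)
    if rng.random() < eta:
        # Best-response mode: epsilon-greedy over the Q values.
        if rng.random() < epsilon:
            return rng.randrange(n), True
        return max(range(n), key=lambda a: q_values[a]), True
    # Average-policy mode: sample from the averaged strategy.
    return rng.choices(range(n), weights=avg_policy, k=1)[0], False


def formation_actions(
    observations: Sequence[float],
    q_fn: Callable[[float], Sequence[float]],
    pi_fn: Callable[[float], Sequence[float]],
    eta: float,
    epsilon: float,
    rng: random.Random,
) -> List[int]:
    """Parameter sharing: every UAV uses the same q_fn and pi_fn,
    so the formation size can change without adding parameters."""
    return [
        nfsp_act(q_fn(obs), pi_fn(obs), eta, epsilon, rng)[0]
        for obs in observations
    ]
```

In a fuller version, `q_fn` and `pi_fn` would be neural networks fed with the autoencoder's compressed state. Here each observation is a single float so that the sketch stays self-contained.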

Translated title of the contribution	Maneuvering strategy generation algorithm for multi-UAV in close-range air combat based on deep reinforcement learning and self-play
Original language	Chinese (Traditional)
Pages (from-to)	352-362
Number of pages	11
Journal	Kongzhi Lilun Yu Yingyong/Control Theory and Applications
Volume	39
Issue number	2
DOI	https://doi.org/10.7641/CTA.2021.10120
Publication status	Published - Feb 2022

Keywords

Air combat decision-making
Fictitious self-play
Multi-UAV cooperation
Reinforcement learning

Cite this

@article{6f120c9930ec4933847534f2f0cb2b2c,
title = "基于深度强化学习与自学习的多无人机近距空战机动策略生成算法",
abstract = "To solve the maneuvering decision-making problem in multi-UAV close-range air combat, a maneuvering strategy generation algorithm based on a parameter-sharing Q network and neural fictitious self-play is proposed. First, a hybrid Markov game model suitable for different UAV formation sizes is designed, together with a reinforcement learning framework for generating multi-UAV maneuvering decision strategies, the parameter-sharing Q network; the state space is compressed by an autoencoder to improve the efficiency of strategy learning. Then, neural fictitious self-play is used to make the maneuver strategy converge to a Nash equilibrium strategy. Finally, simulation experiments are carried out on the parameter selection of the autoencoder, the training process of the strategy generation algorithm, and the rationality and portability of the maneuver strategy. The simulation results show that introducing the autoencoder effectively improves the efficiency of strategy learning, and that the multi-UAV close-range air combat maneuver strategy generated by the algorithm is reasonable and has good portability.",
keywords = "Air combat decision-making, Fictitious self-play, Multi-UAV cooperation, Reinforcement learning",
author = "Kong, {Wei Ren} and Zhou, {De Yun} and Zhao, {Yi Yang} and Yang, {Wan Sha}",
year = "2022",
month = feb,
doi = "10.7641/CTA.2021.10120",
language = "Chinese (Traditional)",
volume = "39",
pages = "352--362",
journal = "Kongzhi Lilun Yu Yingyong/Control Theory and Applications",
issn = "1000-8152",
publisher = "South China University of Technology",
number = "2",
}

TY  - JOUR
T1  - 基于深度强化学习与自学习的多无人机近距空战机动策略生成算法
AU  - Kong, Wei Ren
AU  - Zhou, De Yun
AU  - Zhao, Yi Yang
AU  - Yang, Wan Sha
PY  - 2022/2
Y1  - 2022/2
AB  - To solve the maneuvering decision-making problem in multi-UAV close-range air combat, a maneuvering strategy generation algorithm based on a parameter-sharing Q network and neural fictitious self-play is proposed. First, a hybrid Markov game model suitable for different UAV formation sizes is designed, together with a reinforcement learning framework for generating multi-UAV maneuvering decision strategies, the parameter-sharing Q network; the state space is compressed by an autoencoder to improve the efficiency of strategy learning. Then, neural fictitious self-play is used to make the maneuver strategy converge to a Nash equilibrium strategy. Finally, simulation experiments are carried out on the parameter selection of the autoencoder, the training process of the strategy generation algorithm, and the rationality and portability of the maneuver strategy. The simulation results show that introducing the autoencoder effectively improves the efficiency of strategy learning, and that the multi-UAV close-range air combat maneuver strategy generated by the algorithm is reasonable and has good portability.
KW  - Air combat decision-making
KW  - Fictitious self-play
KW  - Multi-UAV cooperation
KW  - Reinforcement learning
UR  - http://www.scopus.com/inward/record.url?scp=85127308447&partnerID=8YFLogxK
U2  - 10.7641/CTA.2021.10120
DO  - 10.7641/CTA.2021.10120
M3  - Article
AN  - SCOPUS:85127308447
SN  - 1000-8152
VL  - 39
SP  - 352
EP  - 362
JO  - Kongzhi Lilun Yu Yingyong/Control Theory and Applications
JF  - Kongzhi Lilun Yu Yingyong/Control Theory and Applications
IS  - 2
ER  -