深度强化学习的无人作战飞机空战机动决策

Yongfeng Li; Jingping Shi; Weiguo Zhang; Wei Jiang

doi:10.11918/202005108

Translated title of the contribution: Maneuver decision of UCAV in air combat based on deep reinforcement learning

School of Automation

Northwestern Polytechnical University, Xi'an

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

When an unmanned combat aerial vehicle (UCAV) makes autonomous maneuver decisions in air combat, it faces a heavy computational load and is vulnerable to the opponent's uncertain maneuvers. To address these problems, this study proposes a decision-making model for autonomous UCAV maneuvering in air combat based on a deep reinforcement learning algorithm. With this algorithm, the UCAV can make maneuver decisions autonomously during air combat to gain a dominant position. First, based on the aircraft control system, a six-degree-of-freedom UCAV model was built on the MATLAB/Simulink simulation platform, and suitable air combat actions were selected as the maneuver outputs. On this basis, the decision-making model for autonomous UCAV maneuvering in air combat was designed. An operational evaluation model was constructed from the relative motion of the two sides. The range of the missile attack zone was analyzed, and the corresponding advantage function was used as the evaluation basis for deep reinforcement learning. The UCAV was then trained in stages, from easy to difficult, and the optimal maneuver control commands were analyzed by examining the deep Q network. As a result, the UCAV could evaluate the battlefield situation independently, select appropriate maneuver actions in different situations, and make tactical decisions that improve combat effectiveness. Simulation results suggest that the proposed method enables the UCAV to choose tactical actions independently in air combat and reach a dominant position quickly, which greatly improves its combat efficiency.
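As a rough illustration of the deep Q network step the abstract refers to, the sketch below shows epsilon-greedy action selection and the one-step Q-learning target. It is a generic sketch and not the authors' implementation: the maneuver names, the plain list used in place of a neural network, and all parameter values are assumptions, and `reward` merely stands in for the advantage-function value mentioned above.

```python
import random

# Hypothetical discrete maneuver library; the abstract does not list the actual actions.
MANEUVERS = ["level_flight", "climb", "dive", "left_turn", "right_turn"]


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over the Q-values of one state."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: random maneuver index
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


def td_target(reward, next_q_values, gamma, done):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def q_update(q_values, action, target, lr):
    """Move Q(s, a) a fraction lr toward the target.

    This is a tabular stand-in for the gradient step a deep Q network would take.
    """
    updated = list(q_values)
    updated[action] += lr * (target - updated[action])
    return updated


# Usage: one update for a single (assumed) state with all Q-values at zero.
q = [0.0] * len(MANEUVERS)
a = select_action(q, epsilon=0.1, rng=random.Random(0))
q = q_update(q, a, td_target(reward=1.0, next_q_values=q, gamma=0.9, done=False), lr=0.5)
```

In a staged training scheme such as the one the abstract describes, a loop like this would presumably be run first against easier opponent behavior and then against harder ones, but the schedule itself is not given in this record.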

Translated title of the contribution	Maneuver decision of UCAV in air combat based on deep reinforcement learning
Original language	Chinese (Traditional)
Pages (from-to)	33-41
Number of pages	9
Journal	Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology
Volume	53
Issue number	12
DOIs	https://doi.org/10.11918/202005108
State	Published - 30 Dec 2021

Cite this

@article{a98da3cd01d649bbaef479efa25cd3da,

title = "深度强化学习的无人作战飞机空战机动决策",

abstract = "When an unmanned combat aerial vehicle (UCAV) makes autonomous maneuver decisions in air combat, it faces a heavy computational load and is vulnerable to the opponent's uncertain maneuvers. To address these problems, this study proposes a decision-making model for autonomous UCAV maneuvering in air combat based on a deep reinforcement learning algorithm. With this algorithm, the UCAV can make maneuver decisions autonomously during air combat to gain a dominant position. First, based on the aircraft control system, a six-degree-of-freedom UCAV model was built on the MATLAB/Simulink simulation platform, and suitable air combat actions were selected as the maneuver outputs. On this basis, the decision-making model for autonomous UCAV maneuvering in air combat was designed. An operational evaluation model was constructed from the relative motion of the two sides. The range of the missile attack zone was analyzed, and the corresponding advantage function was used as the evaluation basis for deep reinforcement learning. The UCAV was then trained in stages, from easy to difficult, and the optimal maneuver control commands were analyzed by examining the deep Q network. As a result, the UCAV could evaluate the battlefield situation independently, select appropriate maneuver actions in different situations, and make tactical decisions that improve combat effectiveness. Simulation results suggest that the proposed method enables the UCAV to choose tactical actions independently in air combat and reach a dominant position quickly, which greatly improves its combat efficiency.",

keywords = "Advantage function, Autonomous maneuver decision in air combat, Deep Q network, Deep reinforcement learning, Six-degree-of-freedom, Unmanned combat aerial vehicle (UCAV)",

author = "Yongfeng Li and Jingping Shi and Weiguo Zhang and Wei Jiang",

year = "2021",

month = dec,

day = "30",

doi = "10.11918/202005108",

language = "Chinese (Traditional)",

volume = "53",

pages = "33--41",

journal = "Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology",

issn = "0367-6234",

publisher = "Harbin Institute of Technology",

number = "12",

}

TY - JOUR

T1 - 深度强化学习的无人作战飞机空战机动决策

AU - Li, Yongfeng

AU - Shi, Jingping

AU - Zhang, Weiguo

AU - Jiang, Wei

PY - 2021/12/30

Y1 - 2021/12/30

N2 - When an unmanned combat aerial vehicle (UCAV) makes autonomous maneuver decisions in air combat, it faces a heavy computational load and is vulnerable to the opponent's uncertain maneuvers. To address these problems, this study proposes a decision-making model for autonomous UCAV maneuvering in air combat based on a deep reinforcement learning algorithm. With this algorithm, the UCAV can make maneuver decisions autonomously during air combat to gain a dominant position. First, based on the aircraft control system, a six-degree-of-freedom UCAV model was built on the MATLAB/Simulink simulation platform, and suitable air combat actions were selected as the maneuver outputs. On this basis, the decision-making model for autonomous UCAV maneuvering in air combat was designed. An operational evaluation model was constructed from the relative motion of the two sides. The range of the missile attack zone was analyzed, and the corresponding advantage function was used as the evaluation basis for deep reinforcement learning. The UCAV was then trained in stages, from easy to difficult, and the optimal maneuver control commands were analyzed by examining the deep Q network. As a result, the UCAV could evaluate the battlefield situation independently, select appropriate maneuver actions in different situations, and make tactical decisions that improve combat effectiveness. Simulation results suggest that the proposed method enables the UCAV to choose tactical actions independently in air combat and reach a dominant position quickly, which greatly improves its combat efficiency.

KW - Advantage function

KW - Autonomous maneuver decision in air combat

KW - Deep Q network

KW - Deep reinforcement learning

KW - Six-degree-of-freedom

KW - Unmanned combat aerial vehicle (UCAV)

UR - http://www.scopus.com/inward/record.url?scp=85121120998&partnerID=8YFLogxK

U2 - 10.11918/202005108

DO - 10.11918/202005108

M3 - Article

AN - SCOPUS:85121120998

SN - 0367-6234

VL - 53

SP - 33

EP - 41

JO - Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology

JF - Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology

IS - 12

ER -