Maneuver decision of UCAV in air combat based on deep reinforcement learning (深度强化学习的无人作战飞机空战机动决策)

Yongfeng Li; Jingping Shi; Weiguo Zhang; Wei Jiang

doi:10.11918/202005108

School of Automation

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › Peer-reviewed

15 citations (Scopus)

Abstract

When an unmanned combat aerial vehicle (UCAV) makes autonomous maneuver decisions in air combat, it faces a heavy computational load and is susceptible to the enemy's uncertain maneuvers. To address these problems, this study proposes a decision-making model for autonomous UCAV maneuvering in air combat based on a deep reinforcement learning algorithm. With this algorithm, the UCAV can autonomously make maneuver decisions during air combat to reach a dominant position. First, based on the aircraft control system, a six-degree-of-freedom UCAV model was built on the MATLAB/Simulink simulation platform, and appropriate air combat actions were selected as the maneuver output. On this basis, the decision-making model for autonomous UCAV maneuvering in air combat was designed. An operational evaluation model was constructed from the relative motion of the two sides. The range of the missile attack area was analyzed, and the corresponding advantage function was used as the evaluation basis for deep reinforcement learning. The UCAV was then trained in stages of increasing difficulty, and the optimal maneuver control commands were analyzed through the deep Q network. The UCAV could thus select maneuver actions suited to different situations and evaluate the battlefield situation independently, making tactical decisions that improve combat effectiveness. Simulation results suggest that the proposed method enables the UCAV to choose tactical actions independently in air combat and reach a dominant position quickly, which greatly improves its combat efficiency.

Translated title of the contribution	Maneuver decision of UCAV in air combat based on deep reinforcement learning
Original language	Chinese (Traditional)
Pages (from-to)	33-41
Number of pages	9
Journal	Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology
Volume	53
Issue number	12
DOI	https://doi.org/10.11918/202005108
Publication status	Published - 30 Dec 2021

Keywords

Advantage function
Autonomous maneuver decision in air combat
Deep Q network
Deep reinforcement learning
Six-degree-of-freedom
Unmanned combat aerial vehicle (UCAV)

Cite this

@article{a98da3cd01d649bbaef479efa25cd3da,
title = "深度强化学习的无人作战飞机空战机动决策",
abstract = "When an unmanned combat aerial vehicle (UCAV) is making the decision of autonomous maneuver in air combat, it faces large-scale calculation and is susceptible to the uncertain manipulation of the enemy. To tackle such problems, a decision-making model for autonomous maneuver of UCAV in air combat was proposed based on deep reinforcement learning algorithm in this study. With this algorithm, the UCAV can autonomously make maneuver decisions during air combat to achieve dominant position. First, based on the aircraft control system, a six-degree-of-freedom UCAV model was built using MATLAB/Simulink simulation platform, and the appropriate air combat action was selected as the maneuver output. On this basis, the decision-making model for the autonomous maneuver of UCAV in air combat was designed. Through the relative movement of both sides, the operational evaluation model was constructed. The range of the missile attack area was analyzed, and the corresponding advantage function was taken as the evaluation basis of the deep reinforcement learning. Then, the UCAV was trained by stages from the easy to the difficult, and the optimal maneuver control command was analyzed by investigating the deep Q network. Thereby, the UCAV could select corresponding maneuver actions in different situations and evaluate the battlefield situation independently, making tactical decisions and achieving the purpose of improving combat effectiveness. Simulation results suggest that the proposed method can make UCAV choose the tactical action independently in air combat and reach the dominant position quickly, which greatly improves the combat efficiency of the UCAV.",

keywords = "Advantage function, Autonomous maneuver decision in air combat, Deep Q network, Deep reinforcement learning, Six-degree-of-freedom, Unmanned combat aerial vehicle (UCAV)",
author = "Yongfeng Li and Jingping Shi and Weiguo Zhang and Wei Jiang",
year = "2021",
month = dec,
day = "30",
doi = "10.11918/202005108",
language = "Chinese (Traditional)",
volume = "53",
pages = "33--41",
journal = "Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology",
issn = "0367-6234",
publisher = "Harbin Institute of Technology",
number = "12",
}

TY - JOUR
T1 - 深度强化学习的无人作战飞机空战机动决策
AU - Li, Yongfeng
AU - Shi, Jingping
AU - Zhang, Weiguo
AU - Jiang, Wei
PY - 2021/12/30
Y1 - 2021/12/30
N2 - When an unmanned combat aerial vehicle (UCAV) is making the decision of autonomous maneuver in air combat, it faces large-scale calculation and is susceptible to the uncertain manipulation of the enemy. To tackle such problems, a decision-making model for autonomous maneuver of UCAV in air combat was proposed based on deep reinforcement learning algorithm in this study. With this algorithm, the UCAV can autonomously make maneuver decisions during air combat to achieve dominant position. First, based on the aircraft control system, a six-degree-of-freedom UCAV model was built using MATLAB/Simulink simulation platform, and the appropriate air combat action was selected as the maneuver output. On this basis, the decision-making model for the autonomous maneuver of UCAV in air combat was designed. Through the relative movement of both sides, the operational evaluation model was constructed. The range of the missile attack area was analyzed, and the corresponding advantage function was taken as the evaluation basis of the deep reinforcement learning. Then, the UCAV was trained by stages from the easy to the difficult, and the optimal maneuver control command was analyzed by investigating the deep Q network. Thereby, the UCAV could select corresponding maneuver actions in different situations and evaluate the battlefield situation independently, making tactical decisions and achieving the purpose of improving combat effectiveness. Simulation results suggest that the proposed method can make UCAV choose the tactical action independently in air combat and reach the dominant position quickly, which greatly improves the combat efficiency of the UCAV.

AB - When an unmanned combat aerial vehicle (UCAV) is making the decision of autonomous maneuver in air combat, it faces large-scale calculation and is susceptible to the uncertain manipulation of the enemy. To tackle such problems, a decision-making model for autonomous maneuver of UCAV in air combat was proposed based on deep reinforcement learning algorithm in this study. With this algorithm, the UCAV can autonomously make maneuver decisions during air combat to achieve dominant position. First, based on the aircraft control system, a six-degree-of-freedom UCAV model was built using MATLAB/Simulink simulation platform, and the appropriate air combat action was selected as the maneuver output. On this basis, the decision-making model for the autonomous maneuver of UCAV in air combat was designed. Through the relative movement of both sides, the operational evaluation model was constructed. The range of the missile attack area was analyzed, and the corresponding advantage function was taken as the evaluation basis of the deep reinforcement learning. Then, the UCAV was trained by stages from the easy to the difficult, and the optimal maneuver control command was analyzed by investigating the deep Q network. Thereby, the UCAV could select corresponding maneuver actions in different situations and evaluate the battlefield situation independently, making tactical decisions and achieving the purpose of improving combat effectiveness. Simulation results suggest that the proposed method can make UCAV choose the tactical action independently in air combat and reach the dominant position quickly, which greatly improves the combat efficiency of the UCAV.

KW - Advantage function
KW - Autonomous maneuver decision in air combat
KW - Deep Q network
KW - Deep reinforcement learning
KW - Six-degree-of-freedom
KW - Unmanned combat aerial vehicle (UCAV)
UR - http://www.scopus.com/inward/record.url?scp=85121120998&partnerID=8YFLogxK
U2 - 10.11918/202005108
DO - 10.11918/202005108
M3 - Article
AN - SCOPUS:85121120998
SN - 0367-6234
VL - 53
SP - 33
EP - 41
JO - Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology
JF - Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology
IS - 12
ER -
