Tactical intent-driven autonomous air combat behavior generation method

Xingyu Wang; Zhen Yang; Shiyuan Chai; Jichuan Huang; Yupeng He; Deyun Zhou

doi:10.1007/s40747-024-01685-9

School of Electronics and Information

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

Abstract

With the rapid development and deep application of artificial intelligence, modern air combat is steadily evolving towards intelligent combat. Although deep reinforcement learning algorithms have contributed to dramatic advances in air combat, they still face challenges such as poor interpretability and weak transferability of adversarial strategies. To address these challenges, this paper proposes a tactical intent-driven method for autonomous air combat behavior generation. First, the paper explores the mapping between optimal strategies and rewards, demonstrating the detrimental effects on the policy of combining sparse and dense rewards. Building on this, the decision-making process of pilot behavior is analyzed, and a reward mapping model from intent to behavior is established. Finally, to address the poor stability and slow convergence of deep reinforcement learning algorithms in large-scale state-action spaces, a dueling-noisy-multi-step DQN algorithm is devised, which improves the accuracy of value function approximation and enhances the efficiency of space exploration and network generalization. Experiments demonstrate the conflict between sparse and dense rewards, and the empirical results show that the proposed algorithm performs better and more stably than the other algorithms compared. More intuitively, the strategies under different intents exhibit strong interpretability and flexibility, which can provide tactical support for intelligent decision-making in air combat.
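The abstract names a dueling-noisy-multi-step DQN but gives no equations, so the following is only a minimal sketch of two of the three ingredients as they are commonly formulated in the DQN literature: the multi-step bootstrapped target and the dueling aggregation of a state value with per-action advantages. The function names, arguments, and numbers are hypothetical and are not taken from the paper; the noisy-network exploration component is not covered here.

```python
from typing import List, Sequence


def n_step_target(rewards: Sequence[float], bootstrap_value: float, gamma: float) -> float:
    """Multi-step TD target: sum_k gamma^k * r_k + gamma^n * bootstrap_value.

    `rewards` holds the n rewards observed after the current state, and
    `bootstrap_value` is an estimate such as max_a Q(s_{t+n}, a).
    (Assumed generic form, not the paper's exact definition.)
    """
    target = bootstrap_value
    # Fold the rewards from the last step back to the first.
    for reward in reversed(rewards):
        target = reward + gamma * target
    return target


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


if __name__ == "__main__":
    # Hypothetical numbers purely for illustration.
    print(n_step_target([1.0, 1.0], bootstrap_value=2.0, gamma=0.5))
    print(dueling_q_values(1.0, [1.0, 2.0, 3.0]))
```

Subtracting the mean advantage keeps the value and advantage streams identifiable, which is the usual motivation for the dueling form; whether the paper uses the mean or another normalizer is not stated in the abstract.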

Original language	English
Article number	65
Journal	Complex and Intelligent Systems
Volume	11
Issue number	1
DOI	https://doi.org/10.1007/s40747-024-01685-9
Publication status	Published - Jan 2025

Cite this

@article{50c19d9f8b004163b97ab65571264cc4,

title = "Tactical intent-driven autonomous air combat behavior generation method",

abstract = "With the rapid development and deep application of artificial intelligence, modern air combat is incrementally evolving towards intelligent combat. Although deep reinforcement learning algorithms have contributed to dramatic advances in air combat, they still face challenges such as poor interpretability and weak transferability of adversarial strategies. In this regard, this paper proposes a tactical intent-driven method for autonomous air combat behaviour generation. Firstly, this paper explores the mapping relationship between optimal strategies and rewards, demonstrating the detrimental effects of the combination of sparse rewards and dense rewards on policy. Built around this, the decision-making process of pilot behavior is analyzed, and a reward mapping model from intent to behavior is established. Finally, to address the problems of poor stability and slow convergence speed of deep reinforcement learning algorithms in large-scale state-action spaces, the dueling-noisy-multi-step DQN algorithm is devised, which not only improves the accuracy of value function approximation but also enhances the efficiency of space exploration and network generalization. Through experiments, the conflicts between sparse rewards and dense rewards are demonstrated. The superior performance and stability of the proposed algorithm compared to other algorithms are captured by our empirical results. More intuitively, the strategies under different intents exhibit strong interpretability and flexibility, which can provide tactical support for intelligent decision-making in air combat.",

keywords = "Behavioural strategies, Deep reinforcement learning, Reward design, Tactical intent",

author = "Xingyu Wang and Zhen Yang and Shiyuan Chai and Jichuan Huang and Yupeng He and Deyun Zhou",

note = "Publisher Copyright: {\textcopyright} The Author(s) 2024.",

year = "2025",

month = jan,

doi = "10.1007/s40747-024-01685-9",

language = "English",

volume = "11",

journal = "Complex and Intelligent Systems",

issn = "2199-4536",

publisher = "Springer International Publishing AG",

number = "1",

}

TY - JOUR

T1 - Tactical intent-driven autonomous air combat behavior generation method

AU - Wang, Xingyu

AU - Yang, Zhen

AU - Chai, Shiyuan

AU - Huang, Jichuan

AU - He, Yupeng

AU - Zhou, Deyun

N1 - Publisher Copyright: © The Author(s) 2024.

PY - 2025/1

Y1 - 2025/1

N2 - With the rapid development and deep application of artificial intelligence, modern air combat is incrementally evolving towards intelligent combat. Although deep reinforcement learning algorithms have contributed to dramatic advances in air combat, they still face challenges such as poor interpretability and weak transferability of adversarial strategies. In this regard, this paper proposes a tactical intent-driven method for autonomous air combat behaviour generation. Firstly, this paper explores the mapping relationship between optimal strategies and rewards, demonstrating the detrimental effects of the combination of sparse rewards and dense rewards on policy. Built around this, the decision-making process of pilot behavior is analyzed, and a reward mapping model from intent to behavior is established. Finally, to address the problems of poor stability and slow convergence speed of deep reinforcement learning algorithms in large-scale state-action spaces, the dueling-noisy-multi-step DQN algorithm is devised, which not only improves the accuracy of value function approximation but also enhances the efficiency of space exploration and network generalization. Through experiments, the conflicts between sparse rewards and dense rewards are demonstrated. The superior performance and stability of the proposed algorithm compared to other algorithms are captured by our empirical results. More intuitively, the strategies under different intents exhibit strong interpretability and flexibility, which can provide tactical support for intelligent decision-making in air combat.

AB - With the rapid development and deep application of artificial intelligence, modern air combat is incrementally evolving towards intelligent combat. Although deep reinforcement learning algorithms have contributed to dramatic advances in air combat, they still face challenges such as poor interpretability and weak transferability of adversarial strategies. In this regard, this paper proposes a tactical intent-driven method for autonomous air combat behaviour generation. Firstly, this paper explores the mapping relationship between optimal strategies and rewards, demonstrating the detrimental effects of the combination of sparse rewards and dense rewards on policy. Built around this, the decision-making process of pilot behavior is analyzed, and a reward mapping model from intent to behavior is established. Finally, to address the problems of poor stability and slow convergence speed of deep reinforcement learning algorithms in large-scale state-action spaces, the dueling-noisy-multi-step DQN algorithm is devised, which not only improves the accuracy of value function approximation but also enhances the efficiency of space exploration and network generalization. Through experiments, the conflicts between sparse rewards and dense rewards are demonstrated. The superior performance and stability of the proposed algorithm compared to other algorithms are captured by our empirical results. More intuitively, the strategies under different intents exhibit strong interpretability and flexibility, which can provide tactical support for intelligent decision-making in air combat.

KW - Behavioural strategies

KW - Deep reinforcement learning

KW - Reward design

KW - Tactical intent

UR - http://www.scopus.com/inward/record.url?scp=85211379156&partnerID=8YFLogxK

U2 - 10.1007/s40747-024-01685-9

DO - 10.1007/s40747-024-01685-9

M3 - Article

AN - SCOPUS:85211379156

SN - 2199-4536

VL - 11

JO - Complex and Intelligent Systems

JF - Complex and Intelligent Systems

IS - 1

M1 - 65

ER -
