TY - GEN
T1 - Hierarchical Intent-Driven Air Combat Decision Making with Bidirectional Temporal Modeling
AU - Kong, Weiren
AU - Liu, Zhijun
AU - Yuan, Xin
AU - Wang, Xingyu
AU - Li, Shaowei
AU - Zhou, Deyun
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Modern air combat environments are characterized by high dynamics, strong confrontation, and incomplete information, which place higher demands on the tactical decision-making capabilities of intelligent agents. Although reinforcement learning has been widely applied, existing methods often suffer from issues such as poor interpretability, limited generalization ability, and unstable convergence. Meanwhile, the temporal correlation of air combat states has not been sufficiently explored, and the lack of structured decomposition makes it difficult for agents to flexibly switch between diverse tactical intentions. To address these challenges, this paper proposes a hierarchical intention-driven air combat strategy optimization method based on time series modeling. This method draws on the pilot’s “intention-behavior” decision logic and constructs a hierarchical architecture that decouples high-level tactical intention discrimination from low-level action control, making strategy execution more flexible and controllable. In state modeling, the bidirectional long short-term memory network is introduced to extract time series feature information to enhance the model’s perception of situation changes. Simulation experimental results show that this method is superior to traditional methods in terms of confrontation performance, stability, and adaptability.
AB - Modern air combat environments are characterized by high dynamics, strong confrontation, and incomplete information, which place higher demands on the tactical decision-making capabilities of intelligent agents. Although reinforcement learning has been widely applied, existing methods often suffer from issues such as poor interpretability, limited generalization ability, and unstable convergence. Meanwhile, the temporal correlation of air combat states has not been sufficiently explored, and the lack of structured decomposition makes it difficult for agents to flexibly switch between diverse tactical intentions. To address these challenges, this paper proposes a hierarchical intention-driven air combat strategy optimization method based on time series modeling. This method draws on the pilot’s “intention-behavior” decision logic and constructs a hierarchical architecture that decouples high-level tactical intention discrimination from low-level action control, making strategy execution more flexible and controllable. In state modeling, the bidirectional long short-term memory network is introduced to extract time series feature information to enhance the model’s perception of situation changes. Simulation experimental results show that this method is superior to traditional methods in terms of confrontation performance, stability, and adaptability.
KW - air combat confrontation
KW - hierarchical reinforcement learning
KW - temporal neural network
UR - https://www.scopus.com/pages/publications/105034721971
U2 - 10.1109/MLNLP66797.2025.11389087
DO - 10.1109/MLNLP66797.2025.11389087
M3 - Conference contribution
AN - SCOPUS:105034721971
T3 - Conference Proceedings - International Conference on Machine Learning and Natural Language Processing, MLNLP 2025
BT - Conference Proceedings - International Conference on Machine Learning and Natural Language Processing, MLNLP 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 8th International Conference on Machine Learning and Natural Language Processing, MLNLP 2025
Y2 - 7 November 2025 through 9 November 2025
ER -