Abstract
The high maneuverability of modern fighters in close air combat imposes significant cognitive demands on pilots, making rapid, accurate decision-making challenging. While reinforcement learning (RL) has shown promise in this domain, existing methods often lack strategic depth and generalize poorly in complex, high-dimensional environments. To address these limitations, this paper proposes an optimized self-play method supported by improvements in fighter modeling, neural network design, and the algorithmic framework. Unlike traditional 3-DOF models, this study employs a six-degree-of-freedom (6-DOF) F-16 fighter model based on open-source aerodynamic data, equipped with airborne equipment and coupled to a realistic visual simulation platform. To capture temporal dynamics, Long Short-Term Memory (LSTM) layers are integrated into the neural network, complemented by delayed input stacking. The RL environment incorporates expert strategies, curiosity-driven rewards, and curriculum learning to improve adaptability and strategic decision-making. Experimental results demonstrate that the proposed approach achieves a win rate exceeding 90% against classical single-agent methods. Additionally, using the enhanced 3D visual platform, we conducted human-agent confrontation experiments in which the agent attained an average win rate of over 75%. The agent's maneuver trajectories closely align with human pilot strategies, showcasing its potential in decision-making and pilot training applications. This study highlights the effectiveness of integrating advanced modeling and self-play techniques in developing robust air combat decision-making systems.
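The abstract names delayed input stacking but does not define it. The sketch below shows one plausible reading, in which the network input concatenates the current observation with observations from a few earlier time steps. The class name `DelayedStacker`, the `delays` parameter, and the padding rule at the start of an episode are all assumptions made for illustration and are not taken from the paper.

```python
from collections import deque
from typing import Deque, List, Sequence


class DelayedStacker:
    """Concatenate the newest observation with copies from earlier steps.

    Hypothetical helper: the delay pattern and padding rule are assumptions.
    """

    def __init__(self, delays: Sequence[int] = (0, 2, 4)) -> None:
        if not delays or min(delays) < 0:
            raise ValueError("delays must be a non-empty set of non-negative ints")
        self.delays = tuple(delays)
        # Keep just enough history to reach the largest delay.
        self.history: Deque[List[float]] = deque(maxlen=max(self.delays) + 1)

    def reset(self) -> None:
        """Clear the history at the start of a new episode."""
        self.history.clear()

    def push(self, obs: Sequence[float]) -> List[float]:
        """Store a new observation and return the stacked input vector."""
        self.history.append(list(obs))
        newest = len(self.history) - 1
        stacked: List[float] = []
        for d in self.delays:
            # Early in an episode, fall back to the oldest stored frame.
            idx = max(newest - d, 0)
            stacked.extend(self.history[idx])
        return stacked
```

With `delays=(0, 2)`, each call to `push` returns the current observation followed by the one from two steps earlier, so the stacked vector is twice the length of a single observation. A recurrent layer such as an LSTM could then consume this vector at each step.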
| Original language | English |
|---|---|
| Article number | 103526 |
| Journal | Chinese Journal of Aeronautics |
| Volume | 38 |
| Issue number | 9 |
| State | Published - Sep 2025 |
Keywords
- Air combat
- Decision making
- Flight simulation
- Reinforcement learning
- Self-play
Fingerprint
Dive into the research topics of 'Decision-making and confrontation in close-range air combat based on reinforcement learning'. Together they form a unique fingerprint.