Abstract
We present a deep reinforcement learning approach to cooperative guidance for multi-missile strikes under uncontrollable velocity conditions. The method employs the multi-agent proximal policy optimization (MAPPO) algorithm to construct a continuous-action framework for intelligent cooperative guidance. A heuristically reshaped reward function is designed to strengthen cooperation among agents, enabling effective target engagement while mitigating the low learning efficiency caused by sparse reward signals in the guidance environment. In addition, a multi-stage curriculum learning scheme is introduced to smooth agent actions, reducing the action oscillations that arise from independent sampling in reinforcement learning. Simulation results demonstrate that the proposed deep reinforcement learning-based guidance law achieves cooperative attacks across a range of randomized initial conditions.
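As a rough illustration of the reward-shaping idea mentioned in the abstract, the sketch below augments a sparse terminal hit/miss reward with dense shaping terms: one rewarding range closing and one penalizing disagreement in estimated time-to-go across missiles to encourage simultaneous arrival. The function names, weights, and terminal values (`shaped_reward`, `w_close`, `w_sync`, the 100/-50 bonuses) are hypothetical assumptions for illustration only and are not taken from the paper.

```python
# Illustrative sketch of a shaped reward for cooperative simultaneous arrival.
# NOT the paper's reward function; all names, weights, and terms are assumed.

import numpy as np


def estimate_time_to_go(range_to_target: float, closing_speed: float) -> float:
    """Rough time-to-go estimate r / v_c (assumed form, not from the paper)."""
    return range_to_target / max(closing_speed, 1e-6)


def shaped_reward(
    range_to_target: float,        # missile-target range for this agent [m]
    closing_speed: float,          # closing speed along the line of sight [m/s]
    team_time_to_go: list[float],  # time-to-go estimates of all missiles [s]
    hit: bool,                     # sparse terminal event: target intercepted
    miss: bool,                    # sparse terminal event: engagement failed
    w_close: float = 1e-3,         # assumed weight on range-closing shaping
    w_sync: float = 1e-2,          # assumed weight on arrival-time consensus
) -> float:
    """Dense shaping terms plus sparse terminal terms (illustrative only)."""
    # Dense term 1: reward closing on the target (positive closing speed).
    r_close = w_close * closing_speed

    # Dense term 2: penalize deviation of this agent's time-to-go from the
    # team average, encouraging simultaneous arrival.
    t_go = estimate_time_to_go(range_to_target, closing_speed)
    t_go_mean = float(np.mean(team_time_to_go))
    r_sync = -w_sync * abs(t_go - t_go_mean)

    # Sparse terminal terms: large bonus for a hit, penalty for a miss.
    r_terminal = 100.0 if hit else (-50.0 if miss else 0.0)

    return r_close + r_sync + r_terminal


# Example: three missiles, where this agent lags behind the team mean time-to-go.
if __name__ == "__main__":
    r = shaped_reward(
        range_to_target=7500.0,
        closing_speed=300.0,
        team_time_to_go=[21.0, 25.0, 22.0],
        hit=False,
        miss=False,
    )
    print(f"shaped reward: {r:.3f}")
```

Dense terms of this kind give the agents a learning signal at every step, so the policy is not left to discover the sparse terminal bonus purely by chance.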
Original language | English |
---|---|
Article number | 411 |
Journal | Aerospace |
Volume | 12 |
Issue number | 5 |
DOI | |
Publication status | Published - May 2025 |