Abstract
An air-to-air combat system is a complex multi-agent system (MAS) in which a large number of unmanned combat aerial vehicles learn to fight their opponents in a highly dynamic and uncertain environment. Because each individual agent has only local observability, classical multi-agent learning methods struggle to obtain effective cooperative strategies. Recently, communication mechanisms have been proposed to address the local observability issue of MAS. However, existing methods with predefined rules easily cause an exponential increase in state–action pairs, leading to high communication costs. Motivated by this, this paper designs a graph neural network based on a two-stage graph-attention mechanism to capture the key interaction relationships and communication connections between agents in complex air-to-air combat scenarios. Built on Multi-Agent Proximal Policy Optimization, an essential backbone multi-agent reinforcement learning method, the proposed hard- and soft-attention scheme dynamically adjusts the communication relationships and ad hoc network of multiple agents by cutting off unrelated interaction connections while simultaneously weighting the correlation importance between agent pairs. Finally, an experimental study in a simulation environment validates the effectiveness of the proposed method on large-scale air-to-air combat problems.
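The two-stage scheme described above can be illustrated with a minimal numpy sketch: a hard-attention stage produces a binary mask that cuts off unrelated agent pairs, and a soft-attention stage runs a masked softmax over the surviving edges to weight each remaining connection. The function and parameter names (`two_stage_attention`, `W_q`, `W_k`, `hard_threshold`) and the score-thresholding rule are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def two_stage_attention(h, W_q, W_k, hard_threshold=0.0):
    """Hard- then soft-attention over agent embeddings.

    h: (n, d) array of per-agent embeddings.
    Returns aggregated messages, soft-attention weights, and the hard mask.
    Thresholding scaled dot-product scores is an assumed stand-in for the
    paper's learned hard-attention stage.
    """
    q = h @ W_q                                 # queries, (n, d)
    k = h @ W_k                                 # keys, (n, d)
    scores = (q @ k.T) / np.sqrt(h.shape[1])    # pairwise relevance, (n, n)

    # Stage 1: hard attention -- binary mask cuts unrelated pairs.
    mask = scores > hard_threshold
    np.fill_diagonal(mask, False)               # no self-communication

    # Stage 2: soft attention -- masked softmax over surviving edges.
    neg = np.where(mask, scores, -np.inf)
    row_max = np.where(mask.any(axis=1, keepdims=True),
                       neg.max(axis=1, keepdims=True), 0.0)
    exp = np.where(mask, np.exp(scores - row_max), 0.0)
    denom = exp.sum(axis=1, keepdims=True)
    # Rows where every edge was cut receive all-zero weights (no messages).
    attn = np.divide(exp, denom, out=np.zeros_like(exp), where=denom > 0)

    return attn @ h, attn, mask
```

Pruning edges before the softmax is what keeps communication cost down: agents only aggregate messages from the neighbors the hard stage retains, rather than from all n-1 others.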
| Original language | English |
|---|---|
| Pages (from–to) | 19765–19781 |
| Number of pages | 17 |
| Journal | Neural Computing and Applications |
| Volume | 35 |
| Issue | 27 |
| DOI | |
| Publication status | Published - Sep 2023 |