Research on multi-UAV task decision-making based on improved MADDPG algorithm and transfer learning

Bo Li, Shiyang Liang, Zhigang Gan, Daqing Chen, Peixin Gao

Research output: Contribution to journal, Article, peer-reviewed

16 citations (Scopus)

Abstract

Intelligent algorithms for multi-UAV task decision-making currently suffer from two major issues: slow learning and poor generalisation capability. These issues make it difficult to obtain the expected learning results within a reasonable time and to apply a trained model in a new environment. To address these problems, this paper proposes PMADDPG, an improved algorithm based on multi-agent deep deterministic policy gradient (MADDPG). The algorithm adopts a two-layer experience pool structure to achieve prioritised experience replay. Experiences are first stored in the first-layer experience pool; those more conducive to training and learning are then selected according to priority criteria and placed in the second-layer experience pool. The PMADDPG algorithm then draws experiences from the second-layer pool for model training. In addition, a model-based environment transfer learning method is designed to improve the generalisation capability of the algorithm. Comparative experiments show that, compared with the MADDPG algorithm, the proposed algorithms significantly improve learning speed, task success rate and generalisation capability.
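The two-layer experience pool described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the priority criteria, so the sketch assumes each experience carries a scalar priority (for example, the magnitude of its TD error), and the names `TwoLayerReplay`, `store`, `promote` and `sample` are hypothetical.

```python
import random
from collections import deque


class TwoLayerReplay:
    """Hypothetical two-layer experience pool (illustrative only)."""

    def __init__(self, first_capacity, second_capacity, seed=0):
        # Layer 1 holds every incoming experience with its priority.
        self.first = deque(maxlen=first_capacity)
        # Layer 2 holds only experiences promoted from layer 1.
        self.second = deque(maxlen=second_capacity)
        self.rng = random.Random(seed)

    def store(self, transition, priority):
        # ASSUMPTION: priority is a scalar such as |TD error|;
        # the abstract does not specify the actual criterion.
        self.first.append((priority, transition))

    def promote(self, k):
        """Move the k highest-priority experiences from layer 1 to layer 2."""
        order = sorted(range(len(self.first)),
                       key=lambda i: self.first[i][0], reverse=True)
        top = set(order[:k])
        promoted = [t for i, (_, t) in enumerate(self.first) if i in top]
        remaining = [item for i, item in enumerate(self.first) if i not in top]
        # Rebuild layer 1 without the promoted items, preserving arrival order.
        self.first = deque(remaining, maxlen=self.first.maxlen)
        self.second.extend(promoted)

    def sample(self, batch_size):
        """Draw a uniform random training batch from layer 2."""
        n = min(batch_size, len(self.second))
        return self.rng.sample(list(self.second), n)


pool = TwoLayerReplay(first_capacity=100, second_capacity=10)
for name, prio in [("a", 0.1), ("b", 0.9), ("c", 0.5), ("d", 0.7), ("e", 0.2)]:
    pool.store(name, prio)
pool.promote(k=2)
batch = pool.sample(batch_size=2)
```

In this sketch the two highest-priority experiences (`"b"` and `"d"`) move to the second layer, and training batches are drawn only from there. Sampling uniformly from the second layer is a simplification; the paper may weight the draw differently.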

Original language: English
Pages (from-to): 82-91
Number of pages: 10
Journal: International Journal of Bio-Inspired Computation
Volume: 18
Issue number: 2
DOI
Publication status: Published - 2021
