Research on multi-UAV task decision-making based on improved MADDPG algorithm and transfer learning

Bo Li, Shiyang Liang, Zhigang Gan, Daqing Chen, Peixin Gao

Research output: Contribution to journal > Article > peer-review

16 Scopus citations

Abstract

Intelligent algorithms for multi-UAV task decision-making currently suffer from major issues such as slow learning speed and poor generalisation capability, which make it difficult to obtain the expected learning results within a reasonable time and to apply a trained model in a new environment. To address these problems, this paper proposes PMADDPG, an improved algorithm based on multi-agent deep deterministic policy gradient (MADDPG). The algorithm adopts a two-layer experience pool structure to achieve prioritised experience replay. Experiences are first stored in the first-layer experience pool; those more conducive to training and learning are then selected according to priority criteria and placed in the second-layer experience pool. The PMADDPG algorithm then draws experiences from the second-layer pool for model training. In addition, a model-based environment transfer learning method is designed to improve the generalisation capability of the algorithm. Comparative experiments show that, compared with the MADDPG algorithm, the proposed algorithms significantly improve learning speed, task success rate and generalisation capability.
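The two-layer experience pool described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `TwoLayerPool`, the threshold-based selection rule, the use of absolute reward as the priority, and the choice to empty the first layer on promotion are all assumptions, since the abstract does not specify the priority criteria.

```python
import random
from collections import deque


class TwoLayerPool:
    """Hypothetical two-layer experience pool (all names are assumptions)."""

    def __init__(self, first_capacity, second_capacity, priority_fn, threshold):
        self.first = deque(maxlen=first_capacity)    # layer 1: every experience
        self.second = deque(maxlen=second_capacity)  # layer 2: selected subset
        self.priority_fn = priority_fn               # assumed scoring function
        self.threshold = threshold                   # assumed selection cutoff

    def store(self, experience):
        """Put a new experience into the first-layer pool."""
        self.first.append(experience)

    def promote(self):
        """Move experiences that meet the threshold into the second layer.

        Assumption: the first layer is emptied as it is scanned, so each
        experience is considered for promotion only once.
        """
        promoted = 0
        while self.first:
            experience = self.first.popleft()
            if self.priority_fn(experience) >= self.threshold:
                self.second.append(experience)
                promoted += 1
        return promoted

    def sample(self, batch_size, rng=random):
        """Draw a training batch uniformly from the second-layer pool."""
        k = min(batch_size, len(self.second))
        return rng.sample(list(self.second), k)


# Usage: experiences are (obs, action, reward, next_obs, done) tuples, and
# the priority is assumed here to be the absolute reward.
pool = TwoLayerPool(first_capacity=1000, second_capacity=100,
                    priority_fn=lambda e: abs(e[2]), threshold=0.5)
for reward in [0.1, 0.9, -0.7, 0.2]:
    pool.store(("obs", "act", reward, "next_obs", False))
promoted = pool.promote()
batch = pool.sample(2)
```

In a full training loop, the sampled batch would feed the MADDPG critic and actor updates. A priority based on temporal-difference error, as in common prioritised replay schemes, would be another plausible choice for `priority_fn`.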

Original language: English
Pages (from-to): 82-91
Number of pages: 10
Journal: International Journal of Bio-Inspired Computation
Volume: 18
Issue number: 2
State: Published - 2021

Keywords

  • Improved MADDPG algorithm
  • Multi-UAV task decision
  • Transfer learning
  • Two-layer experience pool
