Multiagent Motion Planning Based on Deep Reinforcement Learning in Complex Environments

Dingwei Wu, Kaifang Wan, Xiaoguang Gao, Zijian Hu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

8 Scopus citations

Abstract

When agents in a multiagent system plan motion in complex, dynamic environments, model-based planning algorithms adapt poorly, while intelligent algorithms such as MADDPG have difficulty converging when training multiple agents, and the resulting control model lacks stability and robustness. To address these challenges, this paper proposes a mixed-experience multiagent deep deterministic policy gradient algorithm, referred to as ME-MADDPG. The algorithm supplements training with high-quality experience generated by the artificial potential field method and samples from the different replay buffers with a dynamic probability. Simulation experiments show that, compared with MADDPG, ME-MADDPG greatly improves convergence speed, converged performance, and stability, and that it efficiently provides shorter and more convenient paths for multiagent systems.
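The abstract names two ingredients, experience from an artificial potential field controller and dynamic-probability sampling across replay buffers, without giving details. The following is only a minimal sketch of how such mixed sampling could look, not the authors' implementation. The class name `MixedReplay`, the two-buffer split, the linear annealing schedule, and every parameter value are assumptions made for illustration.

```python
import random
from collections import deque


class MixedReplay:
    """Two replay buffers sampled with a time-varying probability (hypothetical sketch)."""

    def __init__(self, capacity=10000, p_start=0.8, p_end=0.1, decay_steps=1000, seed=0):
        self.guided = deque(maxlen=capacity)    # transitions from a potential-field controller
        self.explored = deque(maxlen=capacity)  # transitions from the learned policy
        self.p_start = p_start
        self.p_end = p_end
        self.decay_steps = decay_steps
        self.rng = random.Random(seed)

    def guided_probability(self, step):
        """Linearly anneal the chance of drawing from the guided buffer (assumed schedule)."""
        frac = min(max(step / self.decay_steps, 0.0), 1.0)
        return self.p_start + frac * (self.p_end - self.p_start)

    def add(self, transition, guided):
        """Store a transition in the guided or the explored buffer."""
        (self.guided if guided else self.explored).append(transition)

    def sample(self, batch_size, step):
        """Draw a batch, choosing the source buffer independently for each sample."""
        if not self.guided and not self.explored:
            raise ValueError("both buffers are empty")
        p = self.guided_probability(step)
        batch = []
        for _ in range(batch_size):
            use_guided = self.rng.random() < p
            source = self.guided if use_guided else self.explored
            if not source:  # fall back to the other buffer if the chosen one is empty
                source = self.explored if use_guided else self.guided
            batch.append(self.rng.choice(source))
        return batch


# Usage: fill both buffers with placeholder transitions, then sample early and late.
buf = MixedReplay(decay_steps=100)
for i in range(50):
    buf.add(("apf", i), guided=True)
    buf.add(("policy", i), guided=False)
early_batch = buf.sample(64, step=0)    # mostly guided transitions under this schedule
late_batch = buf.sample(64, step=100)   # mostly policy transitions under this schedule
```

In this sketch the guided share starts high so that early updates lean on the potential-field experience, then shrinks so that the policy's own experience dominates later. The actual schedule used by ME-MADDPG is not stated in the abstract.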

Original language: English
Title of host publication: 2021 6th International Conference on Control and Robotics Engineering, ICCRE 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 123-128
Number of pages: 6
ISBN (Electronic): 9780738126128
State: Published - 16 Apr 2021
Event: 6th International Conference on Control and Robotics Engineering, ICCRE 2021 - Virtual, Beijing, China
Duration: 16 Apr 2021 to 18 Apr 2021

Publication series

Name: 2021 6th International Conference on Control and Robotics Engineering, ICCRE 2021

Conference

Conference: 6th International Conference on Control and Robotics Engineering, ICCRE 2021
Country/Territory: China
City: Virtual, Beijing
Period: 16/04/21 to 18/04/21

Keywords

  • deep reinforcement learning
  • MADDPG
  • motion planning
  • multiagent
