Progressive Prioritized Experience Replay for Multi-Agent Reinforcement Learning

Zhuoying Chen, Huiping Li, Rizhong Wang, Di Cui

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Due to limitations in payload, perception ability, and communication range, a single agent struggles to meet increasingly complex task requirements. As a result, multi-agent reinforcement learning has attracted growing attention. However, convergence becomes more difficult as the number of agents increases. In this article, an efficient training framework called Progressive Prioritized Experience Replay (PPER) is proposed to address this problem. PPER decomposes the task scene into several similar sub-scenes whose complexity ranges from easy to difficult. A progressive training (PT) approach lets the agents accumulate learning experience in the sub-scenes before entering the full task scene, which greatly reduces the training difficulty. To verify the effectiveness of the training framework, we extended OpenAI Gym to create a multi-USV confrontation environment, and comparative tests demonstrate the superior performance of PPER.
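
The abstract does not spell out the training loop, so the following is only a minimal sketch of how a standard proportional prioritized replay buffer (in the style of Schaul et al.) might be combined with an easy-to-hard curriculum over sub-scenes, in the spirit of PPER. The `make_scene` factory, the agent `act`/`learn` interface, the sub-scene levels, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


class PrioritizedReplayBuffer:
    """Simplified proportional prioritized replay (assumption: standard PER, not PPER's exact variant)."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha          # how strongly priorities skew sampling
        self.buffer = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition):
        # New transitions get the current maximum priority so they are replayed at least once.
        max_prio = self.priorities.max() if self.buffer else 1.0
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
        else:
            self.buffer[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        prios = self.priorities[:len(self.buffer)] ** self.alpha
        probs = prios / prios.sum()
        idx = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights correct the bias introduced by non-uniform sampling.
        weights = (len(self.buffer) * probs[idx]) ** (-beta)
        weights /= weights.max()
        return [self.buffer[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors, eps=1e-6):
        self.priorities[idx] = np.abs(td_errors) + eps


def progressive_training(make_scene, agents, episodes_per_scene=500):
    """Train agents on sub-scenes ordered from easy to hard, reusing one replay buffer.

    `make_scene(level)` and the agents' `act`/`learn` methods are hypothetical
    placeholders; only the curriculum structure is illustrated here.
    """
    buffer = PrioritizedReplayBuffer(capacity=100_000)
    for level in ["easy", "medium", "full_task"]:       # sub-scenes of increasing complexity
        env = make_scene(level)
        for _ in range(episodes_per_scene):
            obs = env.reset()
            done = False
            while not done:
                actions = [a.act(o) for a, o in zip(agents, obs)]
                next_obs, rewards, done, _ = env.step(actions)
                buffer.add((obs, actions, rewards, next_obs, done))
                obs = next_obs
                if len(buffer.buffer) >= 1024:
                    batch, idx, w = buffer.sample(256)
                    # Assumed centralized update that returns per-sample TD errors.
                    td_errors = agents[0].learn(batch, w)
                    buffer.update_priorities(idx, td_errors)
    return agents
```

The key design point the sketch tries to capture is that experience gathered in the easier sub-scenes stays in the same prioritized buffer when the agents move to harder scenes, so earlier experience can still be replayed while the curriculum advances.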

Original language: English
Title of host publication: Proceedings of the 43rd Chinese Control Conference, CCC 2024
Editors: Jing Na, Jian Sun
Publisher: IEEE Computer Society
Pages: 8292-8296
Number of pages: 5
ISBN (Electronic): 9789887581581
DOIs
State: Published - 2024
Event: 43rd Chinese Control Conference, CCC 2024 - Kunming, China
Duration: 28 Jul 2024 - 31 Jul 2024

Publication series

Name: Chinese Control Conference, CCC
ISSN (Print): 1934-1768
ISSN (Electronic): 2161-2927

Conference

Conference: 43rd Chinese Control Conference, CCC 2024
Country/Territory: China
City: Kunming
Period: 28/07/24 - 31/07/24

Keywords

  • Multi-USV
  • PPER
  • Progressive Training
