TY - GEN
T1 - Progressive Prioritized Experience Replay for Multi-Agent Reinforcement Learning
AU - Chen, Zhuoying
AU - Li, Huiping
AU - Wang, Rizhong
AU - Cui, Di
N1 - Publisher Copyright:
© 2024 Technical Committee on Control Theory, Chinese Association of Automation.
PY - 2024
Y1 - 2024
N2 - Due to limitations in payload, perception ability, and communication range, a single agent struggles to meet increasingly complex task requirements. As a result, multi-agent reinforcement learning algorithms have attracted growing attention. However, algorithm convergence becomes more difficult as the number of agents increases. In this article, an efficient training framework called Progressive Prioritized Experience Replay (PPER) is proposed to address this problem. PPER decomposes the task scene into several similar sub-scenes whose complexity ranges from easy to difficult. A progressive training (PT) approach lets the agents accumulate learning experience in the sub-scenes before entering the full task scene, which greatly reduces the training difficulty. To verify the effectiveness of the proposed training framework, we extended OpenAI Gym to create a multi-USV confrontation environment, where comparative tests demonstrate the superior performance of PPER.
KW - Multi-USV
KW - PPER
KW - Progressive Training
UR - http://www.scopus.com/inward/record.url?scp=85205494756&partnerID=8YFLogxK
U2 - 10.23919/CCC63176.2024.10661678
DO - 10.23919/CCC63176.2024.10661678
M3 - Conference contribution
AN - SCOPUS:85205494756
T3 - Chinese Control Conference, CCC
SP - 8292
EP - 8296
BT - Proceedings of the 43rd Chinese Control Conference, CCC 2024
A2 - Na, Jing
A2 - Sun, Jian
PB - IEEE Computer Society
T2 - 43rd Chinese Control Conference, CCC 2024
Y2 - 28 July 2024 through 31 July 2024
ER -