TY - GEN
T1 - Multi-robot Cooperative Navigation Method based on Multi-agent Reinforcement Learning in Sparse Reward Tasks
AU - Li, Kai
AU - Wang, Quanhu
AU - Gong, Mengyao
AU - Li, Jiahui
AU - Shi, Haobin
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Multi-robot systems can collaborate to accomplish more complex tasks than a single robot. Cooperative navigation is the basis for multi-robot systems to complete rescue, reconnaissance, and other tasks in high-risk areas in place of human beings. Multi-agent reinforcement learning (MARL) is the most effective method for controlling multi-robot cooperation, but the sparsity of rewards limits its application in real scenarios. In this paper, a curiosity-inspired MARL approach called CIMADDPG is proposed to promote robot exploration. A global curiosity allocation mechanism is designed to determine each agent's contribution to the global reward. In addition, to ensure that the collaboration of agents is not lost during exploration, a dual critic network is designed to jointly guide the update of the policy network. Finally, the performance of the proposed method is verified in a multi-agent particle environment (MPE) and a multi-robot (Turtlebot3) cooperative navigation simulation environment. The experimental results show that CIMADDPG improves on the performance of SOTA methods by 23.53% to 48.84% and achieves a high success rate in multi-robot cooperative navigation.
AB - Multi-robot systems can collaborate to accomplish more complex tasks than a single robot. Cooperative navigation is the basis for multi-robot systems to complete rescue, reconnaissance, and other tasks in high-risk areas in place of human beings. Multi-agent reinforcement learning (MARL) is the most effective method for controlling multi-robot cooperation, but the sparsity of rewards limits its application in real scenarios. In this paper, a curiosity-inspired MARL approach called CIMADDPG is proposed to promote robot exploration. A global curiosity allocation mechanism is designed to determine each agent's contribution to the global reward. In addition, to ensure that the collaboration of agents is not lost during exploration, a dual critic network is designed to jointly guide the update of the policy network. Finally, the performance of the proposed method is verified in a multi-agent particle environment (MPE) and a multi-robot (Turtlebot3) cooperative navigation simulation environment. The experimental results show that CIMADDPG improves on the performance of SOTA methods by 23.53% to 48.84% and achieves a high success rate in multi-robot cooperative navigation.
KW - collaborative navigation
KW - deep reinforcement learning
KW - multi-robot
KW - multi-agent reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85175613071&partnerID=8YFLogxK
U2 - 10.1109/ISCEIC59030.2023.10271221
DO - 10.1109/ISCEIC59030.2023.10271221
M3 - Conference contribution
AN - SCOPUS:85175613071
T3 - 2023 4th International Symposium on Computer Engineering and Intelligent Communications, ISCEIC 2023
SP - 257
EP - 261
BT - 2023 4th International Symposium on Computer Engineering and Intelligent Communications, ISCEIC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th International Symposium on Computer Engineering and Intelligent Communications, ISCEIC 2023
Y2 - 18 August 2023 through 20 August 2023
ER -