TY - JOUR
T1 - Neighborhood-Curiosity-Based Exploration in Multiagent Reinforcement Learning
AU - Yang, Shike
AU - He, Ziming
AU - Li, Jingchen
AU - Shi, Haobin
AU - Ji, Qingbing
AU - Hwang, Kao-Shing
AU - Li, Xianshan
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2025
Y1 - 2025
N2 - Efficient exploration in cooperative multiagent reinforcement learning remains challenging in complex tasks. In this article, we propose a novel multiagent collaborative exploration method called neighborhood-curiosity-based exploration (NCE), with which agents explore not only novel states but also new partnerships. Concretely, we use the attention mechanism in graph convolutional networks to perform a weighted summation of features from neighbors. The resulting attention weights can be regarded as an embodiment of the relationships among agents. We then use the prediction errors of the aggregated features as intrinsic rewards to facilitate exploration. When agents encounter novel states or new partnerships, NCE produces large prediction errors and, in turn, large intrinsic rewards. Moreover, in multiagent systems, agents are influenced most strongly by their neighbors and interact directly only with them. Exploring partnerships with neighbors therefore enables agents to capture their most important cooperative relations with other agents. As a result, NCE effectively promotes collaborative exploration even in environments with many agents. Our experimental results show that NCE achieves significant performance improvements on the challenging StarCraft Multi-Agent Challenge (SMAC) micromanagement benchmark.
KW - multiagent reinforcement learning (MARL)
KW - multiagent system
UR - http://www.scopus.com/inward/record.url?scp=105003135570&partnerID=8YFLogxK
U2 - 10.1109/TCDS.2024.3460368
DO - 10.1109/TCDS.2024.3460368
M3 - Article
AN - SCOPUS:105003135570
SN - 2379-8920
VL - 17
SP - 379
EP - 389
JO - IEEE Transactions on Cognitive and Developmental Systems
JF - IEEE Transactions on Cognitive and Developmental Systems
IS - 2
ER -