Learning controlled and targeted communication with the centralized critic for the multi-agent system

Qingshuang Sun, Yuan Yao, Peng Yi, Yu Jiao Hu, Zhao Yang, Gang Yang, Xingshe Zhou

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

Multi-agent deep reinforcement learning (MDRL) has attracted attention for solving complex tasks. Two main challenges of MDRL are non-stationarity and partial observability from the perspective of agents, both of which hinder agents' learning of cooperative policies. In this study, Controlled and Targeted Communication with the Centralized Critic (COTAC) is proposed, constructing a paradigm of centralized learning and decentralized execution with partial communication. It decouples how the multi-agent system obtains environmental information during training and during execution. Specifically, COTAC makes the environment faced by agents stationary in the training phase and learns partial communication to overcome the limitation of partial observability in the execution phase. On this basis, decentralized actors learn controlled and targeted communication, with their policies optimized by centralized critics during training. As a result, agents learn both when to communicate when sending and how to aggregate information in a targeted way when receiving. In addition, COTAC is evaluated on two multi-agent scenarios with continuous spaces. Experimental results demonstrate that only the agents holding important information choose to send messages, and receivers aggregate the received information in a targeted way by identifying the relevant important information, yielding better cooperation performance while reducing the communication traffic of the system.
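The abstract describes a gate-then-attend communication pattern on top of centralized training with decentralized execution. The following is a minimal, illustrative PyTorch sketch of that pattern, not the paper's actual implementation: each actor gates whether to send a message (controlled sending), receivers aggregate incoming messages with dot-product attention (targeted receiving), and a centralized critic scores joint observations and actions during training only. All module names, layer sizes, and the straight-through hard gate are assumptions made for illustration.

```python
# Illustrative sketch of controlled (gated) sending and targeted (attention-
# based) receiving with a centralized critic. Architectural details are
# assumptions, not taken from the COTAC paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CommActor(nn.Module):
    def __init__(self, obs_dim, act_dim, msg_dim=16, hidden=64):
        super().__init__()
        self.encode = nn.Linear(obs_dim, hidden)
        self.msg_head = nn.Linear(hidden, msg_dim)   # message content
        self.gate_head = nn.Linear(hidden, 1)        # send / stay silent
        self.query = nn.Linear(hidden, msg_dim)      # attention query
        self.policy = nn.Linear(hidden + msg_dim, act_dim)

    def send(self, obs):
        h = torch.relu(self.encode(obs))
        msg = self.msg_head(h)
        # Hard 0/1 gate with a straight-through estimator so the decision
        # stays differentiable during centralized training (assumed trick).
        p = torch.sigmoid(self.gate_head(h))
        gate = (p > 0.5).float() + p - p.detach()
        return h, gate * msg, gate

    def act(self, h, messages):
        # messages: (n_agents, msg_dim); zeroed rows come from silent agents.
        q = self.query(h)
        scores = messages @ q / messages.shape[-1] ** 0.5
        attn = F.softmax(scores, dim=0)              # targeted weights
        aggregated = attn @ messages
        return torch.tanh(self.policy(torch.cat([h, aggregated], -1)))


class CentralizedCritic(nn.Module):
    """Q(joint obs, joint actions): sees everything during training only."""
    def __init__(self, n_agents, obs_dim, act_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_agents * (obs_dim + act_dim), hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, joint_obs, joint_act):
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))


if __name__ == "__main__":
    n, obs_dim, act_dim = 3, 8, 2
    actors = [CommActor(obs_dim, act_dim) for _ in range(n)]
    critic = CentralizedCritic(n, obs_dim, act_dim)

    obs = torch.randn(n, obs_dim)
    hs, msgs, gates = zip(*(a.send(o) for a, o in zip(actors, obs)))
    inbox = torch.stack(msgs)                        # gated messages
    acts = torch.stack([a.act(h, inbox) for a, h in zip(actors, hs)])
    q = critic(obs.flatten(), acts.flatten())
    print("sent:", [int(g.item()) for g in gates], "Q:", q.item())
```

In this sketch, the hard gate is what reduces communication traffic (silent agents contribute all-zero messages), while the straight-through estimator keeps the send decision trainable from the centralized critic's gradient; the attention weights are what make receiving targeted, letting each agent up-weight the messages it finds relevant.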

Original language: English
Pages (from-to): 14819-14837
Number of pages: 19
Journal: Applied Intelligence
Volume: 53
Issue number: 12
DOI
Publication status: Published - Jun 2023
