PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration

Pengyi Li; Hongyao Tang; Tianpei Yang; Xiaotian Hao; Tong Sang; Yan Zheng; Jianye Hao; Matthew E. Taylor; Wenyuan Tao; Zhen Wang

School of Cybersecurity

Research output: Contribution to journal › Conference article › peer-review

20 Scopus citations

Abstract

Learning to collaborate is critical in Multi-Agent Reinforcement Learning (MARL). Previous works promote collaboration by maximizing the correlation of agents' behaviors, which is typically characterized by Mutual Information (MI) in different forms. However, we reveal that sub-optimal collaborative behaviors also emerge with strong correlations, and that simply maximizing the MI can, surprisingly, hinder learning towards better collaboration. To address this issue, we propose a novel MARL framework, called Progressive Mutual Information Collaboration (PMIC), for more effective MI-driven collaboration. PMIC uses a new collaboration criterion measured by the MI between global states and joint actions. Based on this criterion, the key idea of PMIC is to maximize the MI associated with superior collaborative behaviors and minimize the MI associated with inferior ones. The two MI objectives play complementary roles by facilitating better collaboration while avoiding falling into sub-optimal ones. Experiments on a wide range of MARL benchmarks show the superior performance of PMIC compared with other algorithms.
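
The collaboration criterion named in the abstract, the MI between global states and joint actions, can be illustrated with a minimal sketch. The code below is a plain plug-in (count-based) estimator for small discrete toy data; the function name `empirical_mi` and the toy samples are assumptions made for illustration, and this record does not describe the estimators the paper itself uses.

```python
import math
from collections import Counter

def empirical_mi(pairs):
    """Plug-in estimate of I(S; U) in nats from (state, joint_action) samples.

    Hypothetical helper for illustration only; states and joint actions
    must be hashable (e.g. ints and tuples of ints).
    """
    n = len(pairs)
    joint = Counter(pairs)
    s_counts = Counter(s for s, _ in pairs)
    u_counts = Counter(u for _, u in pairs)
    mi = 0.0
    for (s, u), c in joint.items():
        # p(s,u) * log(p(s,u) / (p(s) * p(u))), written with raw counts
        mi += (c / n) * math.log(c * n / (s_counts[s] * u_counts[u]))
    return mi

# Toy data (assumed): the joint action is fully determined by the state.
coordinated = [(0, (0, 0)), (1, (1, 1))] * 50
# Toy data (assumed): the joint action is independent of the state.
uncorrelated = [(s, (a, a)) for s in (0, 1) for a in (0, 1)] * 25

print(empirical_mi(coordinated))   # equals log(2) for this toy data
print(empirical_mi(uncorrelated))  # equals 0 for this toy data
```

A high value here only says that joint actions track the state; it says nothing about return, which is consistent with the abstract's observation that sub-optimal behaviors can also be strongly correlated.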

Original language	English
Pages (from-to)	12979-12997
Number of pages	19
Journal	Proceedings of Machine Learning Research
Volume	162
State	Published - 2022
Event	39th International Conference on Machine Learning, ICML 2022 - Baltimore, United States
Duration	17 Jul 2022 → 23 Jul 2022

Cite this

@article{ebfc461d078448dfa63441095ee8df4e,

title = "PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration",

abstract = "Learning to collaborate is critical in Multi-Agent Reinforcement Learning (MARL). Previous works promote collaboration by maximizing the correlation of agents' behaviors, which is typically characterized by Mutual Information (MI) in different forms. However, we reveal that sub-optimal collaborative behaviors also emerge with strong correlations, and that simply maximizing the MI can, surprisingly, hinder learning towards better collaboration. To address this issue, we propose a novel MARL framework, called Progressive Mutual Information Collaboration (PMIC), for more effective MI-driven collaboration. PMIC uses a new collaboration criterion measured by the MI between global states and joint actions. Based on this criterion, the key idea of PMIC is to maximize the MI associated with superior collaborative behaviors and minimize the MI associated with inferior ones. The two MI objectives play complementary roles by facilitating better collaboration while avoiding falling into sub-optimal ones. Experiments on a wide range of MARL benchmarks show the superior performance of PMIC compared with other algorithms.",

author = "Pengyi Li and Hongyao Tang and Tianpei Yang and Xiaotian Hao and Tong Sang and Yan Zheng and Jianye Hao and Taylor, {Matthew E.} and Wenyuan Tao and Zhen Wang",

note = "Publisher Copyright: Copyright {\textcopyright} 2022 by the author(s); 39th International Conference on Machine Learning, ICML 2022 ; Conference date: 17-07-2022 Through 23-07-2022",

year = "2022",

language = "English",

volume = "162",

pages = "12979--12997",

journal = "Proceedings of Machine Learning Research",

issn = "2640-3498",

publisher = "ML Research Press",

}

TY - JOUR

T1 - PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration

T2 - 39th International Conference on Machine Learning, ICML 2022

AU - Li, Pengyi

AU - Tang, Hongyao

AU - Yang, Tianpei

AU - Hao, Xiaotian

AU - Sang, Tong

AU - Zheng, Yan

AU - Hao, Jianye

AU - Taylor, Matthew E.

AU - Tao, Wenyuan

AU - Wang, Zhen

PY - 2022

Y1 - 2022

N2 - Learning to collaborate is critical in Multi-Agent Reinforcement Learning (MARL). Previous works promote collaboration by maximizing the correlation of agents' behaviors, which is typically characterized by Mutual Information (MI) in different forms. However, we reveal that sub-optimal collaborative behaviors also emerge with strong correlations, and that simply maximizing the MI can, surprisingly, hinder learning towards better collaboration. To address this issue, we propose a novel MARL framework, called Progressive Mutual Information Collaboration (PMIC), for more effective MI-driven collaboration. PMIC uses a new collaboration criterion measured by the MI between global states and joint actions. Based on this criterion, the key idea of PMIC is to maximize the MI associated with superior collaborative behaviors and minimize the MI associated with inferior ones. The two MI objectives play complementary roles by facilitating better collaboration while avoiding falling into sub-optimal ones. Experiments on a wide range of MARL benchmarks show the superior performance of PMIC compared with other algorithms.

AB - Learning to collaborate is critical in Multi-Agent Reinforcement Learning (MARL). Previous works promote collaboration by maximizing the correlation of agents' behaviors, which is typically characterized by Mutual Information (MI) in different forms. However, we reveal that sub-optimal collaborative behaviors also emerge with strong correlations, and that simply maximizing the MI can, surprisingly, hinder learning towards better collaboration. To address this issue, we propose a novel MARL framework, called Progressive Mutual Information Collaboration (PMIC), for more effective MI-driven collaboration. PMIC uses a new collaboration criterion measured by the MI between global states and joint actions. Based on this criterion, the key idea of PMIC is to maximize the MI associated with superior collaborative behaviors and minimize the MI associated with inferior ones. The two MI objectives play complementary roles by facilitating better collaboration while avoiding falling into sub-optimal ones. Experiments on a wide range of MARL benchmarks show the superior performance of PMIC compared with other algorithms.

UR - http://www.scopus.com/inward/record.url?scp=85149662646&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85149662646

SN - 2640-3498

VL - 162

SP - 12979

EP - 12997

JO - Proceedings of Machine Learning Research

JF - Proceedings of Machine Learning Research

Y2 - 17 July 2022 through 23 July 2022

ER -
