Mutual Information-Guided Subtask Selection for Zero-Shot Generalization in Multi-Agent Reinforcement Learning

  • Shijie Wu
  • Yuan Yao
  • Wenqi Zhang
  • Yining Zhu
  • Yujiao Hu
  • Gang Yang
  • Xingshe Zhou
  • Northwestern Polytechnical University, Xi'an

Research output: Contribution to journal › Conference article › peer-review

Abstract

Modular methods, which decompose complex joint policies into function-specific sub-policies, have been widely adopted to enhance asymptotic performance in single-task cooperative multi-agent reinforcement learning (MARL). However, modular policies trained on source tasks often struggle to generalize to unseen scenarios due to variations across tasks, such as mismatched action spaces and divergent state dynamics. To address this challenge, we propose Mutual Information-Guided Subtask Selection (MIGSS), a novel framework that enhances zero-shot generalization in MARL through two key innovations: a Discriminative Group Trajectory Encoder and Global Attention-Driven Coordination. Specifically, the Discriminative Group Trajectory Encoder remaps agent trajectories by maximizing the mutual information between agent trajectories and dynamically assigned groups. This yields cross-task-consistent group trajectories with broader embedding distributions, encouraging agents in distinct states to select specialized subtasks and effectively promoting functional modularity. Meanwhile, the Global Attention-Driven Coordination employs a global attention mechanism to integrate state information, coordinating group trajectories for expressive credit assignment. Extensive experiments in StarCraft II cooperative scenarios demonstrate that MIGSS significantly outperforms strong baselines in zero-shot generalization under both single-task and multi-task settings. Visualization analyses confirm that the learned group trajectories successfully disperse agent trajectories into a consistent and broader embedding space, thereby enhancing subtask modularization.
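The mutual-information objective described above is commonly optimized through a variational lower bound: MI(τ; g) ≥ E[log q(g | τ)] + H(g), where q is a learned posterior over groups given a trajectory embedding. The following is a minimal NumPy sketch of computing such a bound; all dimensions, the linear classifier, and the random embeddings are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): 8 agents, 16-dim trajectory
# embeddings, 4 candidate groups.
n_agents, d_traj, n_groups = 8, 16, 4

# Stand-in trajectory embeddings, as might be produced by an encoder.
traj = rng.normal(size=(n_agents, d_traj))
# Dynamically assigned group label for each agent.
groups = rng.integers(0, n_groups, size=n_agents)

# Variational posterior q(g | tau): a linear classifier with softmax.
W = rng.normal(scale=0.1, size=(d_traj, n_groups))

def mi_lower_bound(traj, groups, W):
    """Variational (Barber-Agakov) bound: MI(tau; g) >= E[log q(g|tau)] + H(g)."""
    logits = traj @ W
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_q = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Expected log-probability of each agent's assigned group.
    cross_term = log_q[np.arange(len(groups)), groups].mean()
    # Entropy of the empirical group distribution.
    p_g = np.bincount(groups, minlength=W.shape[1]) / len(groups)
    entropy = -(p_g[p_g > 0] * np.log(p_g[p_g > 0])).sum()
    return cross_term + entropy

bound = mi_lower_bound(traj, groups, W)
print(bound)
```

Maximizing this bound with respect to the encoder and classifier parameters simultaneously tightens the trajectory-group mutual information and sharpens the group assignments, which is the mechanism the abstract attributes to the Discriminative Group Trajectory Encoder.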
