Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning

Haoran He; Chenjia Bai; Kang Xu; Zhuoran Yang; Weinan Zhang; Dong Wang; Bin Zhao; Xuelong Li

Institute of Optoelectronics and Intelligence (光电与智能研究院)

Research output: Contribution to journal › Conference article › Peer-reviewed

25 citations (Scopus)

Abstract

Diffusion models have demonstrated highly-expressive generative capabilities in vision and NLP. Recent studies in reinforcement learning (RL) have shown that diffusion models are also powerful in modeling complex policies or trajectories in offline datasets. However, these works have been limited to single-task settings where a generalist agent capable of addressing multi-task predicaments is absent. In this paper, we aim to investigate the effectiveness of a single diffusion model in modeling large-scale multi-task offline data, which can be challenging due to diverse and multimodal data distribution. Specifically, we propose Multi-Task Diffusion Model (MTDIFF), a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis in multitask offline settings. MTDIFF leverages vast amounts of knowledge available in multi-task data and performs implicit knowledge sharing among tasks. For generative planning, we find MTDIFF outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D. For data synthesis, MTDIFF generates high-quality data for testing tasks given a single demonstration as a prompt, which enhances the low-quality datasets for even unseen tasks.

Original language	English
Journal	Advances in Neural Information Processing Systems
Volume	36
Publication status	Published - 2023
Event	37th Conference on Neural Information Processing Systems, NeurIPS 2023 - New Orleans, United States. Duration: 10 Dec 2023 → 16 Dec 2023

Cite this

@article{bbc1a4a90e3c480a991537b328045f28,

title = "Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning",

abstract = "Diffusion models have demonstrated highly-expressive generative capabilities in vision and NLP. Recent studies in reinforcement learning (RL) have shown that diffusion models are also powerful in modeling complex policies or trajectories in offline datasets. However, these works have been limited to single-task settings where a generalist agent capable of addressing multi-task predicaments is absent. In this paper, we aim to investigate the effectiveness of a single diffusion model in modeling large-scale multi-task offline data, which can be challenging due to diverse and multimodal data distribution. Specifically, we propose Multi-Task Diffusion Model (MTDIFF), a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis in multitask offline settings. MTDIFF leverages vast amounts of knowledge available in multi-task data and performs implicit knowledge sharing among tasks. For generative planning, we find MTDIFF outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D. For data synthesis, MTDIFF generates high-quality data for testing tasks given a single demonstration as a prompt, which enhances the low-quality datasets for even unseen tasks.",

author = "Haoran He and Chenjia Bai and Kang Xu and Zhuoran Yang and Weinan Zhang and Dong Wang and Bin Zhao and Xuelong Li",

note = "Publisher Copyright: {\textcopyright} 2023 Neural information processing systems foundation. All rights reserved.; 37th Conference on Neural Information Processing Systems, NeurIPS 2023 ; Conference date: 10-12-2023 Through 16-12-2023",

year = "2023",

language = "English",

volume = "36",

journal = "Advances in Neural Information Processing Systems",

issn = "1049-5258",

publisher = "Neural information processing systems foundation",

}

TY - JOUR

T1 - Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning

AU - He, Haoran

AU - Bai, Chenjia

AU - Xu, Kang

AU - Yang, Zhuoran

AU - Zhang, Weinan

AU - Wang, Dong

AU - Zhao, Bin

AU - Li, Xuelong

PY - 2023

Y1 - 2023

AB - Diffusion models have demonstrated highly-expressive generative capabilities in vision and NLP. Recent studies in reinforcement learning (RL) have shown that diffusion models are also powerful in modeling complex policies or trajectories in offline datasets. However, these works have been limited to single-task settings where a generalist agent capable of addressing multi-task predicaments is absent. In this paper, we aim to investigate the effectiveness of a single diffusion model in modeling large-scale multi-task offline data, which can be challenging due to diverse and multimodal data distribution. Specifically, we propose Multi-Task Diffusion Model (MTDIFF), a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis in multitask offline settings. MTDIFF leverages vast amounts of knowledge available in multi-task data and performs implicit knowledge sharing among tasks. For generative planning, we find MTDIFF outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D. For data synthesis, MTDIFF generates high-quality data for testing tasks given a single demonstration as a prompt, which enhances the low-quality datasets for even unseen tasks.

UR - http://www.scopus.com/inward/record.url?scp=85180301327&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85180301327

SN - 1049-5258

VL - 36

JO - Advances in Neural Information Processing Systems

JF - Advances in Neural Information Processing Systems

T2 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023

Y2 - 10 December 2023 through 16 December 2023

ER -
