TY - JOUR
T1 - DeRL: Coupling Decomposition in Action Space for Reinforcement Learning Task
T2 - IEEE Transactions on Emerging Topics in Computational Intelligence
AU - He, Ziming
AU - Li, Jingchen
AU - Wu, Fan
AU - Shi, Haobin
AU - Hwang, Kao Shing
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2024/2/1
Y1 - 2024/2/1
N2 - This paper is concerned with complex reinforcement learning tasks whose observations are difficult to characterize as appropriate inputs for policy mapping. Representation learning is leveraged to extract features from the observations for optimal action generation. In the literature, the action vector, which consists of actions in each dimension, is usually learned from the full set of features. However, we find empirically that different actions may be highly related to only a small part of the features and depend only weakly on the rest. Therefore, this unified learning strategy may lead to performance degradation, which motivates a separate learning method. In this paper, we propose a novel method called Decoupled Reinforcement Learning (DeRL) that decomposes the action space by replacing the policy network with a decoupled sub-policy group. To cater to all types of tasks, in which agents' actions in different dimensions can be either weakly or strongly correlated, a Bidirectional Recurrent Neural Network (Bi-RNN) is added as an essential component to further capture shared features for generating more accurate actions. In this framework, the decoupled policy network maintains the joint representations required for deciding all actions in different dimensions while decreasing preference in the learning process. In addition, we give a theoretical analysis of DeRL from the perspective of information theory, which shows the difference in information loss between DeRL and other methods. The performance of the proposed method has been verified by comparative experiments on 12 tasks, including MuJoCo, Atari, and other popular environments.
AB - This paper is concerned with complex reinforcement learning tasks whose observations are difficult to characterize as appropriate inputs for policy mapping. Representation learning is leveraged to extract features from the observations for optimal action generation. In the literature, the action vector, which consists of actions in each dimension, is usually learned from the full set of features. However, we find empirically that different actions may be highly related to only a small part of the features and depend only weakly on the rest. Therefore, this unified learning strategy may lead to performance degradation, which motivates a separate learning method. In this paper, we propose a novel method called Decoupled Reinforcement Learning (DeRL) that decomposes the action space by replacing the policy network with a decoupled sub-policy group. To cater to all types of tasks, in which agents' actions in different dimensions can be either weakly or strongly correlated, a Bidirectional Recurrent Neural Network (Bi-RNN) is added as an essential component to further capture shared features for generating more accurate actions. In this framework, the decoupled policy network maintains the joint representations required for deciding all actions in different dimensions while decreasing preference in the learning process. In addition, we give a theoretical analysis of DeRL from the perspective of information theory, which shows the difference in information loss between DeRL and other methods. The performance of the proposed method has been verified by comparative experiments on 12 tasks, including MuJoCo, Atari, and other popular environments.
KW - Action space
KW - deep reinforcement learning
KW - multi-agent reinforcement learning
KW - policy optimization
KW - representation learning
UR - http://www.scopus.com/inward/record.url?scp=85181827732&partnerID=8YFLogxK
U2 - 10.1109/TETCI.2023.3326551
DO - 10.1109/TETCI.2023.3326551
M3 - Article
AN - SCOPUS:85181827732
SN - 2471-285X
VL - 8
SP - 1030
EP - 1043
JO - IEEE Transactions on Emerging Topics in Computational Intelligence
JF - IEEE Transactions on Emerging Topics in Computational Intelligence
IS - 1
ER -