TY - JOUR
T1 - DeRL: Coupling Decomposition in Action Space for Reinforcement Learning Task
T2 - IEEE Transactions on Emerging Topics in Computational Intelligence
AU - He, Ziming
AU - Li, Jingchen
AU - Wu, Fan
AU - Shi, Haobin
AU - Hwang, Kao Shing
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2024/2/1
Y1 - 2024/2/1
N2 - This paper is concerned with complex reinforcement learning tasks whose observations are difficult to characterize as appropriate inputs for policy mapping. Representation learning is leveraged to extract features from the observations for optimal action generation. In the literature, the action vector, which consists of actions in each dimension, is usually learned from the full set of features. However, we find empirically that different actions may be highly related to only a small part of the features and depend only weakly on the rest. Therefore, this unified learning strategy may lead to performance degradation, which motivates a separate learning method. In this paper, we propose a novel method called Decoupled Reinforcement Learning (DeRL) that decomposes the action space by replacing the policy network with a decoupled sub-policy group. To cater to all types of tasks, in which agents' actions in different dimensions can be either weakly or strongly correlated, a Bidirectional Recurrent Neural Network (Bi-RNN) is added as an essential component to further capture shared features for generating more accurate actions. In this framework, the decoupled policy network maintains the joint representations required for deciding all actions in different dimensions while decreasing preference in the learning process. In addition, we give a theoretical analysis of DeRL from the perspective of information theory, which shows the difference in information loss between DeRL and other methods. The performance of the proposed method has been verified by comparative experiments on 12 tasks, including MuJoCo, Atari, and other popular environments.
AB - This paper is concerned with complex reinforcement learning tasks whose observations are difficult to characterize as appropriate inputs for policy mapping. Representation learning is leveraged to extract features from the observations for optimal action generation. In the literature, the action vector, which consists of actions in each dimension, is usually learned from the full set of features. However, we find empirically that different actions may be highly related to only a small part of the features and depend only weakly on the rest. Therefore, this unified learning strategy may lead to performance degradation, which motivates a separate learning method. In this paper, we propose a novel method called Decoupled Reinforcement Learning (DeRL) that decomposes the action space by replacing the policy network with a decoupled sub-policy group. To cater to all types of tasks, in which agents' actions in different dimensions can be either weakly or strongly correlated, a Bidirectional Recurrent Neural Network (Bi-RNN) is added as an essential component to further capture shared features for generating more accurate actions. In this framework, the decoupled policy network maintains the joint representations required for deciding all actions in different dimensions while decreasing preference in the learning process. In addition, we give a theoretical analysis of DeRL from the perspective of information theory, which shows the difference in information loss between DeRL and other methods. The performance of the proposed method has been verified by comparative experiments on 12 tasks, including MuJoCo, Atari, and other popular environments.
KW - Action space
KW - deep reinforcement learning
KW - multi-agent reinforcement learning
KW - policy optimization
KW - representation learning
UR - http://www.scopus.com/inward/record.url?scp=85181827732&partnerID=8YFLogxK
U2 - 10.1109/TETCI.2023.3326551
DO - 10.1109/TETCI.2023.3326551
M3 - Article
AN - SCOPUS:85181827732
SN - 2471-285X
VL - 8
SP - 1030
EP - 1043
JO - IEEE Transactions on Emerging Topics in Computational Intelligence
JF - IEEE Transactions on Emerging Topics in Computational Intelligence
IS - 1
ER -