TY - GEN
T1 - SCRL
T2 - 2023 International Conference on Artificial Intelligence of Things and Systems, AIoTSys 2023
AU - Fang, Yuyang
AU - Guo, Bin
AU - Liu, Jiaqi
AU - Zhao, Kaixing
AU - Ding, Yasan
AU - Wang, Na
AU - Yu, Zhiwen
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Deep reinforcement learning outperforms humans in many tasks. In practice, however, the deployment scenarios of an agent often change, and model performance may therefore degrade, leading to unreasonable decisions in new scenarios. It is thus important for a model to be able to evolve in new scenarios. Recent attempts have studied the continual training of models through reward signals or self-supervised learning; however, reward signals in the target domain are generally difficult to obtain, and the performance of purely self-supervised evolution is limited. In addition, adapting to the target domain usually causes forgetting of the source domain. To overcome these problems, we propose a self-supervised continual reinforcement learning method, called SCRL, which combines reinforcement learning tasks with self-supervised auxiliary tasks and weight regularizers. Specifically, in the source domain, we first jointly train the reinforcement learning and self-supervised objectives, which enables them to share the same encoder. The Fisher information matrix is then computed to record the importance of the model parameters to the source domain. In the target domain, the encoder is used for the reinforcement learning task and is updated by self-supervised learning under the control of the Fisher regularizer. Extensive experiments on four continuous control tasks from the DeepMind Control suite show that, in the absence of reward signals, SCRL can effectively adapt the model to the target domain without catastrophically forgetting the source domain.
AB - Deep reinforcement learning outperforms humans in many tasks. In practice, however, the deployment scenarios of an agent often change, and model performance may therefore degrade, leading to unreasonable decisions in new scenarios. It is thus important for a model to be able to evolve in new scenarios. Recent attempts have studied the continual training of models through reward signals or self-supervised learning; however, reward signals in the target domain are generally difficult to obtain, and the performance of purely self-supervised evolution is limited. In addition, adapting to the target domain usually causes forgetting of the source domain. To overcome these problems, we propose a self-supervised continual reinforcement learning method, called SCRL, which combines reinforcement learning tasks with self-supervised auxiliary tasks and weight regularizers. Specifically, in the source domain, we first jointly train the reinforcement learning and self-supervised objectives, which enables them to share the same encoder. The Fisher information matrix is then computed to record the importance of the model parameters to the source domain. In the target domain, the encoder is used for the reinforcement learning task and is updated by self-supervised learning under the control of the Fisher regularizer. Extensive experiments on four continuous control tasks from the DeepMind Control suite show that, in the absence of reward signals, SCRL can effectively adapt the model to the target domain without catastrophically forgetting the source domain.
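N1 - Reviewer note: the Fisher-regularized adaptation described in the abstract follows the elastic-weight-consolidation pattern: estimate a diagonal Fisher information matrix on source-domain data, then penalize target-domain self-supervised updates that move parameters the Fisher marks as important. The sketch below is a minimal PyTorch illustration of that general pattern under stated assumptions, not the authors' implementation; the names fisher_diagonal, regularized_ssl_loss, ssl_loss_fn, src_params, and lam are illustrative.

    # Minimal sketch (assumption, not the paper's code) of an EWC-style
    # Fisher regularizer combined with a self-supervised adaptation loss.
    import torch

    def fisher_diagonal(model, loss_fn, data_loader):
        """Estimate the diagonal Fisher information on source-domain data
        as the average of squared gradients of the source loss."""
        fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
        for batch in data_loader:
            model.zero_grad()
            loss_fn(model, batch).backward()
            for n, p in model.named_parameters():
                if p.grad is not None:
                    fisher[n] += p.grad.detach() ** 2
        return {n: f / len(data_loader) for n, f in fisher.items()}

    def regularized_ssl_loss(model, batch, ssl_loss_fn, fisher, src_params, lam):
        """Target-domain self-supervised loss plus a Fisher penalty that
        anchors parameters important to the source domain, mitigating
        catastrophic forgetting. src_params holds detached copies of the
        source-trained weights; lam weights the penalty."""
        loss = ssl_loss_fn(model, batch)
        penalty = sum((fisher[n] * (p - src_params[n]) ** 2).sum()
                      for n, p in model.named_parameters())
        return loss + lam * penalty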
KW - continual learning
KW - domain adaptation
KW - Fisher regularizer
KW - reinforcement learning
KW - self-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85188249202&partnerID=8YFLogxK
U2 - 10.1109/AIoTSys58602.2023.00030
DO - 10.1109/AIoTSys58602.2023.00030
M3 - Conference contribution
AN - SCOPUS:85188249202
T3 - Proceedings - 2023 International Conference on Artificial Intelligence of Things and Systems, AIoTSys 2023
SP - 55
EP - 63
BT - Proceedings - 2023 International Conference on Artificial Intelligence of Things and Systems, AIoTSys 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 19 October 2023 through 22 October 2023
ER -