TY - GEN
T1 - SCRL
T2 - 2023 International Conference on Artificial Intelligence of Things and Systems, AIoTSys 2023
AU - Fang, Yuyang
AU - Guo, Bin
AU - Liu, Jiaqi
AU - Zhao, Kaixing
AU - Ding, Yasan
AU - Wang, Na
AU - Yu, Zhiwen
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Deep reinforcement learning outperforms humans in many tasks. In practice, however, the deployment scenarios of an agent often change, and model performance may therefore degrade, leading to unreasonable decisions in new scenarios. It is thus important for a model to be able to evolve in new scenarios. Recent attempts have studied the continual training of models through reward signals or self-supervised learning; however, reward signals in the target domain are generally difficult to obtain, and the performance of purely self-supervised evolution is limited. In addition, adapting to the target domain usually causes forgetting of the source domain. To overcome these problems, we propose a self-supervised continual reinforcement learning method, called SCRL, which combines reinforcement learning tasks with self-supervised auxiliary tasks and weight regularizers. Specifically, in the source domain, we first jointly train the reinforcement learning and self-supervised objectives, which enables them to share the same encoder. The Fisher information matrix is then computed to record the importance of the model parameters to the source domain. In the target domain, the encoder is used for the reinforcement learning task and is updated by self-supervised learning under the control of the Fisher regularizer. Extensive experiments on four continuous control tasks from the DeepMind Control suite show that, in the absence of reward signals, SCRL can effectively adapt the model to the target domain without catastrophically forgetting the source domain.
AB - Deep reinforcement learning outperforms humans in many tasks. In practice, however, the deployment scenarios of an agent often change, and model performance may therefore degrade, leading to unreasonable decisions in new scenarios. It is thus important for a model to be able to evolve in new scenarios. Recent attempts have studied the continual training of models through reward signals or self-supervised learning; however, reward signals in the target domain are generally difficult to obtain, and the performance of purely self-supervised evolution is limited. In addition, adapting to the target domain usually causes forgetting of the source domain. To overcome these problems, we propose a self-supervised continual reinforcement learning method, called SCRL, which combines reinforcement learning tasks with self-supervised auxiliary tasks and weight regularizers. Specifically, in the source domain, we first jointly train the reinforcement learning and self-supervised objectives, which enables them to share the same encoder. The Fisher information matrix is then computed to record the importance of the model parameters to the source domain. In the target domain, the encoder is used for the reinforcement learning task and is updated by self-supervised learning under the control of the Fisher regularizer. Extensive experiments on four continuous control tasks from the DeepMind Control suite show that, in the absence of reward signals, SCRL can effectively adapt the model to the target domain without catastrophically forgetting the source domain.
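N1 - Reviewer note: the Fisher-regularized adaptation described in the abstract follows the elastic-weight-consolidation pattern: estimate a diagonal Fisher information matrix on source-domain data, then penalize target-domain self-supervised updates that move parameters the Fisher marks as important. The sketch below is a minimal PyTorch illustration of that general pattern under stated assumptions, not the authors' implementation; the names fisher_diagonal, regularized_ssl_loss, ssl_loss_fn, src_params, and lam are illustrative.

    # Minimal sketch (assumption, not the paper's code) of an EWC-style
    # Fisher regularizer combined with a self-supervised adaptation loss.
    import torch

    def fisher_diagonal(model, loss_fn, data_loader):
        """Estimate the diagonal Fisher information on source-domain data
        as the average of squared gradients of the source loss."""
        fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
        for batch in data_loader:
            model.zero_grad()
            loss_fn(model, batch).backward()
            for n, p in model.named_parameters():
                if p.grad is not None:
                    fisher[n] += p.grad.detach() ** 2
        return {n: f / len(data_loader) for n, f in fisher.items()}

    def regularized_ssl_loss(model, batch, ssl_loss_fn, fisher, src_params, lam):
        """Target-domain self-supervised loss plus a Fisher penalty that
        anchors parameters important to the source domain, mitigating
        catastrophic forgetting. src_params holds detached copies of the
        source-trained weights; lam weights the penalty."""
        loss = ssl_loss_fn(model, batch)
        penalty = sum((fisher[n] * (p - src_params[n]) ** 2).sum()
                      for n, p in model.named_parameters())
        return loss + lam * penalty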
KW - continual learning
KW - domain adaptation
KW - Fisher regularizer
KW - reinforcement learning
KW - self-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85188249202&partnerID=8YFLogxK
U2 - 10.1109/AIoTSys58602.2023.00030
DO - 10.1109/AIoTSys58602.2023.00030
M3 - Conference contribution
AN - SCOPUS:85188249202
T3 - Proceedings - 2023 International Conference on Artificial Intelligence of Things and Systems, AIoTSys 2023
SP - 55
EP - 63
BT - Proceedings - 2023 International Conference on Artificial Intelligence of Things and Systems, AIoTSys 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 19 October 2023 through 22 October 2023
ER -