TY - JOUR
T1 - Task-aware Orthogonal Sparse Network for Exploring Shared Knowledge in Continual Learning
AU - Hu, Yusong
AU - Cheng, De
AU - Zhang, Dingwen
AU - Wang, Nannan
AU - Liu, Tongliang
AU - Gao, Xinbo
N1 - Publisher Copyright:
Copyright 2024 by the author(s)
PY - 2024
Y1 - 2024
AB - Continual learning (CL) aims to learn from sequentially arriving tasks without catastrophic forgetting (CF). By partitioning the network into two parts based on the Lottery Ticket Hypothesis, one holding the knowledge of old tasks and the other learning the knowledge of the new task, recent work has achieved forget-free CL. Although such methods address the CF issue well, they encounter serious under-fitting in long-term CL, where learning continues for a long time and the number of new tasks is much higher. To solve this problem, this paper partitions the network into three parts, with a new part for exploring the knowledge shared between old and new tasks. Using this shared knowledge, that part of the network can be trained to simultaneously consolidate the old tasks and fit the new task. To achieve this goal, we propose a task-aware Orthogonal Sparse Network (OSN), which comprises shared-knowledge-induced network partition and sharpness-aware orthogonal sparse network learning. The former partitions the network to select shared parameters, while the latter guides the exploration of shared knowledge through those parameters. Qualitative and quantitative analyses show that the proposed OSN induces minimal to no interference with past tasks, i.e., approximately no forgetting, while greatly improving model plasticity and capacity, achieving state-of-the-art performance.
UR - http://www.scopus.com/inward/record.url?scp=85203846634&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85203846634
SN - 2640-3498
VL - 235
SP - 19153
EP - 19164
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 41st International Conference on Machine Learning, ICML 2024
Y2 - 21 July 2024 through 27 July 2024
ER -