TY - GEN
T1 - Training Consistent Mixture-of-Experts-Based Prompt Generator for Continual Learning
AU - Lu, Yue
AU - Zhang, Shizhou
AU - Cheng, De
AU - Liang, Guoqiang
AU - Xing, Yinghui
AU - Wang, Nannan
AU - Zhang, Yanning
N1 - Publisher Copyright:
Copyright © 2025, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2025/4/11
Y1 - 2025/4/11
AB - Visual prompt tuning-based continual learning (CL) methods have shown promising performance in exemplar-free scenarios, where their key component can be viewed as a prompt generator. Existing approaches generally rely on freezing old prompts, slow updating, and task discrimination to preserve the stability of the prompt generator and minimize forgetting. In contrast, we introduce a novel approach that trains a consistent prompt generator to ensure stability during CL. Consistency means that for any instance from an old task, the instance-aware prompt produced by the prompt generator remains unchanged even as the generator is continually updated on new tasks. This keeps the representation of a specific instance stable across tasks and thereby prevents forgetting. We employ a mixture of experts (MoE) as the prompt generator, which contains a router and multiple experts. By deriving sufficient conditions for the consistency of the MoE prompt generator, we show that consistency is theoretically guaranteed if, during training on a new task, the router and experts update in directions orthogonal to the subspaces spanned by old input features and gating vectors, respectively. To enforce this orthogonality, we project parameter gradients onto those orthogonal directions using orthogonal projection matrices computed via the null space method. Extensive experiments on four class-incremental learning benchmarks validate the effectiveness and superiority of our approach.
UR - https://www.scopus.com/pages/publications/105003904146
DO - 10.1609/aaai.v39i18.34082
M3 - Conference contribution
AN - SCOPUS:105003904146
T3 - Proceedings of the AAAI Conference on Artificial Intelligence
SP - 18915
EP - 18923
BT - Special Track on AI Alignment
A2 - Walsh, Toby
A2 - Shah, Julie
A2 - Kolter, Zico
PB - Association for the Advancement of Artificial Intelligence
T2 - 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025
Y2 - 25 February 2025 through 4 March 2025
ER -