TY - JOUR
T1 - VPT-NSP2++: Importance-Aware Visual Prompt Tuning in Null Space for Continual Learning
T2 - IEEE Transactions on Pattern Analysis and Machine Intelligence
AU - Zhang, Shizhou
AU - Lu, Yue
AU - Cheng, De
AU - Xing, Yinghui
AU - Wang, Nannan
AU - Wang, Peng
AU - Zhang, Yanning
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2026
Y1 - 2026
AB - Continual learning (CL) enables AI models to adapt to evolving environments while mitigating catastrophic forgetting, a critical capability for dynamic real-world applications. With the growing popularity of pre-trained Vision Transformer (ViT) models and the visual prompt tuning (VPT) technique in CL, this work explores a CL method built on a ViT-based foundation model through the VPT mechanism, with theoretical guarantees. Inspired by the orthogonal projection method, we leverage it for VPT to enhance CL performance, particularly in long-term scenarios. However, since orthogonal projection was originally designed for the linear operations in CNNs, applying it to ViTs poses challenges induced by the non-linear self-attention mechanism and the distribution drift within LayerNorm. To address these issues, we derive two orthogonality conditions that achieve prompt gradient orthogonal projection and provide a theoretical guarantee of stability. Since strict orthogonal constraints can diminish model capacity and reduce plasticity, we further propose an importance-aware orthogonal regularization framework. By applying varying degrees of orthogonal constraint to different parameters according to their importance to old and new tasks, the framework adaptively enhances model capacity, thereby promoting long-sequence CL and improving the stability-plasticity trade-off. To implement the proposed approach, a null-space-based approximation is employed to efficiently achieve the prompt gradient orthogonal projection. Extensive experiments on various class-incremental learning benchmarks demonstrate that our method achieves state-of-the-art performance across diverse CL scenarios.
KW - Continual learning
KW - catastrophic forgetting
KW - null space
KW - orthogonal projection
KW - visual prompt tuning
UR - https://www.scopus.com/pages/publications/105024830202
U2 - 10.1109/TPAMI.2025.3642298
DO - 10.1109/TPAMI.2025.3642298
M3 - Article
AN - SCOPUS:105024830202
SN - 0162-8828
VL - 48
SP - 4318
EP - 4335
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 4
ER -