TY - JOUR
T1 - Hyper adversarial tuning for boosting adversarial robustness of pretrained large vision transformers
AU - Lv, Kangtao
AU - Fan, Wenyan
AU - Cao, Huangsen
AU - Tu, Kainan
AU - Xu, Yihuai
AU - Zhang, Zhimeng
AU - Li, Yang
AU - Ding, Xin
AU - Wang, Yongwei
N1 - Publisher Copyright:
© 2025 Elsevier Ltd
PY - 2026/3
Y1 - 2026/3
AB - Large vision Transformers (ViTs) have achieved competitive performance on various computer vision tasks through large-scale pre-training. However, large ViTs remain vulnerable to adversarial examples, underscoring the need to enhance their adversarial robustness. While adversarial training is an effective defense for deep convolutional models, it often faces scalability issues with large ViTs due to its high computational cost. Recent approaches propose robust fine-tuning methods, such as adversarial tuning of low-rank adaptation (LoRA) in ViTs; however, they still struggle to match the accuracy of full-parameter adversarial fine-tuning. An effective synergy of diverse defense mechanisms offers a promising route to enhancing ViT robustness, yet this paradigm remains largely underexplored. To address this, we propose hyper adversarial tuning (HyperAT), a meta-learning approach that captures shared defensive knowledge among different methods to improve model robustness both efficiently and effectively. Specifically, adversarial tuning with each defense method is formulated as a learning task, and a hypernetwork generates LoRA parameters specific to that defense. A random sampling-and-tuning strategy is then proposed to extract defensive knowledge and facilitate its transfer between different defenses. Finally, the diverse LoRAs are merged adaptively to further enhance adversarial robustness. Experiments on various datasets and model architectures demonstrate that HyperAT significantly enhances the adversarial robustness of pretrained large vision models without excessive computational overhead, establishing a new state-of-the-art benchmark.
KW - Adversarial robustness
KW - Adversarial tuning
KW - Hypernetwork
KW - Model merging
KW - Robust LoRA
UR - https://www.scopus.com/pages/publications/105011414453
U2 - 10.1016/j.patcog.2025.112158
DO - 10.1016/j.patcog.2025.112158
M3 - Article
AN - SCOPUS:105011414453
SN - 0031-3203
VL - 171
JO - Pattern Recognition
JF - Pattern Recognition
M1 - 112158
ER -