跳到主要导航 跳到搜索 跳到主要内容

Sample-agnostic Adversarial Perturbation for Vision-Language Pre-training Models

  • Northwestern Polytechnical University Xian
  • Harbin Institute of Technology

科研成果: 书/报告/会议事项章节会议稿件同行评审

6 引用 (Scopus)

摘要

Recent studies on AI security have highlighted the vulnerability of Vision-Language Pre-training (VLP) models to subtle yet intentionally designed perturbations in images and texts. Investigating multimodal systems' robustness via adversarial attacks is crucial in this field. Most multimodal attacks are sample-specific, generating a unique perturbation for each sample to construct adversarial samples. To the best of our knowledge, it is the first work through multimodal decision boundaries to explore the creation of a universal, sample-agnostic perturbation that applies to any image. Initially, we explore strategies to move sample points beyond the decision boundaries of linear classifiers, refining the algorithm to ensure successful attacks under the top k accuracy metric. Based on this foundation, in visual-language tasks, we treat visual and textual modalities as reciprocal sample points and decision hyperplanes, guiding image embeddings to traverse text-constructed decision boundaries, and vice versa. This iterative process consistently refines a universal perturbation, ultimately identifying a singular direction within the input space which is exploitable to impair the retrieval performance of VLP models. The proposed algorithms support the creation of global perturbations or adversarial patches. Comprehensive experiments validate the effectiveness of our method, showcasing its data, task, and model transferability across various VLP models and datasets. Code: https://github.com/LibertazZ/MUAP

源语言英语
主期刊名MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
出版商Association for Computing Machinery, Inc
9749-9758
页数10
ISBN(电子版)9798400706868
DOI
出版状态已出版 - 28 10月 2024
活动32nd ACM International Conference on Multimedia, MM 2024 - Melbourne, 澳大利亚
期限: 28 10月 20241 11月 2024

出版系列

姓名MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia

会议

会议32nd ACM International Conference on Multimedia, MM 2024
国家/地区澳大利亚
Melbourne
时期28/10/241/11/24

指纹

探究 'Sample-agnostic Adversarial Perturbation for Vision-Language Pre-training Models' 的科研主题。它们共同构成独一无二的指纹。

引用此