Abstract
Existing generative attacks primarily learn from single-object scenarios, and they may fail to handle the intricate spatial and semantic relationships between multiple objects, which are common in real-world scenes with dense occlusions and other complexities. To address these limitations, we propose Generative Attack in Complex Real-world Scenarios (GACRS), a novel method designed to enhance the transferability of adversarial examples. First, our analysis indicates that the existing method for utilizing the CLIP text branch is limited, mainly because its random sampling strategy introduces sampling bias and restricts it to scenes containing only two categories. We therefore propose a multi-object clustering-based text sampling method tailored to the CLIP text branch, which enhances the diversity and relevance of text features and provides more meaningful guidance for generator optimization. In addition, to the best of our knowledge, we are the first to apply curriculum learning to the training process of generative attacks. This involves a dynamic input sample selection strategy that adapts to different training phases, enabling the generator to transition from simpler tasks to more complex ones and thereby improving the generalization capability of the adversarial perturbations. Extensive experiments across within-domain, cross-domain, and cross-task scenarios show that GACRS consistently outperforms existing peer methods. Code will be released at https://github.com/phyyyy/GACRS.
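As an illustration of the curriculum idea only, the sketch below shows one simple way a sample pool could grow from easy to hard over training. The abstract does not specify the actual selection rule, so everything here is an assumption: the function name `curriculum_pool`, the use of object count as the difficulty measure, the linear schedule, and the `start_frac` parameter are all hypothetical and are not taken from GACRS.

```python
def curriculum_pool(samples, difficulty, epoch, total_epochs, start_frac=0.3):
    """Return the samples available at `epoch`, easiest first.

    Hypothetical sketch: the pool starts with the easiest `start_frac`
    of the data and grows linearly until it covers the full dataset
    in the final epoch.
    """
    if total_epochs < 1:
        raise ValueError("total_epochs must be at least 1")
    if not samples:
        return []
    # Rank samples from easiest to hardest (sorted() is stable).
    ranked = sorted(samples, key=difficulty)
    # Training progress in [0, 1]; a single-epoch run uses the full pool.
    progress = 1.0 if total_epochs == 1 else min(1.0, epoch / (total_epochs - 1))
    frac = start_frac + (1.0 - start_frac) * progress
    keep = min(len(ranked), max(1, round(frac * len(ranked))))
    return ranked[:keep]


# Assumed difficulty measure: number of annotated objects in the image.
object_counts = [5, 1, 3, 2, 4, 1, 6, 2, 3, 7]
samples = [{"id": i, "num_objects": n} for i, n in enumerate(object_counts)]

def by_object_count(sample):
    return sample["num_objects"]

first_pool = curriculum_pool(samples, by_object_count, epoch=0, total_epochs=5)
last_pool = curriculum_pool(samples, by_object_count, epoch=4, total_epochs=5)
```

With these assumptions, early epochs draw only from images with few objects, and the most crowded scenes enter the pool near the end of training. Other difficulty measures (for example, surrogate-model loss) or non-linear schedules would fit the same interface.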
Original language | English |
---|---|
Article number | 111893 |
Journal | Pattern Recognition |
Volume | 169 |
DOIs | |
State | Published - Jan 2026 |
Keywords
- Clustering
- Computer vision
- Curriculum learning
- Generative attack
- Pattern recognition