Dual-Branch Task Residual Enhancement with Parameter-Free Attention for Zero-Shot Multi-label Image Recognition

Shizhou Zhang, Kairui Dang, De Cheng, Yinghui Xing, Qirui Wu, Dexuan Kong, Yanning Zhang

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Zero-shot multi-label image recognition is the task of recognizing multi-label images when “zero” visual information has been input into the model during training. With the recent emergence of large pre-trained vision-language models, visual and semantic features can be well aligned after training on billions of image-text pairs collected from the internet. In this paper, building on the pre-trained CLIP model, we propose a dual-branch task residual enhancement with a parameter-free attention module that enhances the interaction of inter-modal information to tackle the problem of multi-label image recognition. The method employs a dual-branch structure consisting of a global branch and a local branch. The local branch mitigates global feature dominance, improving the understanding of image content in local regions. Our method surpasses state-of-the-art methods in zero-shot multi-label learning on the VOC2007, MS-COCO, and NUS-WIDE datasets. It also performs well in partial-label settings. Code is available in the supplementary materials.
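
The abstract does not give the formulation, so the following is only a minimal NumPy sketch of how a global branch and a local branch could be combined over CLIP-style embeddings. Everything in it is an assumption made for illustration: the function name `dual_branch_scores`, the way the task residual is added to the frozen text embeddings, the reading of "parameter-free attention" as a softmax over patch-label similarities, and the `alpha`, `temperature`, and equal-weight averaging choices. It is not the authors' implementation.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to (approximately) unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def dual_branch_scores(global_feat, patch_feats, text_feats, residual,
                       alpha=0.5, temperature=0.07):
    """Hypothetical dual-branch scoring (illustrative assumption only).

    global_feat: (D,)   image-level embedding
    patch_feats: (P, D) per-region embeddings
    text_feats:  (C, D) frozen label (text) embeddings
    residual:    (C, D) task residual added to the text embeddings
    """
    # Task residual (assumed form): frozen text features plus a scaled offset.
    t = l2_normalize(text_feats + alpha * residual)
    g = l2_normalize(global_feat)
    p = l2_normalize(patch_feats)

    # Global branch: cosine similarity of the whole image with each label.
    global_scores = t @ g                      # shape (C,)

    # Local branch: patch-label cosine similarities, shape (P, C).
    sim = p @ t.T
    # Attention without learned weights (assumed form): softmax over patches,
    # computed separately for each label.
    attn = softmax(sim / temperature, axis=0)
    local_scores = (attn * sim).sum(axis=0)    # shape (C,)

    # Assumed fusion: plain average of the two branches.
    combined = 0.5 * (global_scores + local_scores)
    return combined, global_scores, local_scores


# Toy example: three labels on orthogonal axes of a 4-dimensional space.
text_feats = np.eye(3, 4)
residual = np.zeros_like(text_feats)
# The global embedding only reflects label 0 ...
global_feat = np.array([1.0, 0.0, 0.0, 0.0])
# ... but one of the two patches matches label 1.
patch_feats = np.array([[0.0, 1.0, 0.0, 0.0],
                        [1.0, 0.0, 0.0, 0.0]])

combined, global_scores, local_scores = dual_branch_scores(
    global_feat, patch_feats, text_feats, residual)
print(combined)
```

In this toy setup the global score for label 1 is zero while its local score is high, which is one way to picture the local branch offsetting the dominance of the global feature.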

Original language: English
Title of host publication: Pattern Recognition - 27th International Conference, ICPR 2024, Proceedings
Editors: Apostolos Antonacopoulos, Subhasis Chaudhuri, Rama Chellappa, Cheng-Lin Liu, Saumik Bhattacharya, Umapada Pal
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 160-171
Number of pages: 12
ISBN (Print): 9783031783111
DOI
Publication status: Published - 2025
Event: 27th International Conference on Pattern Recognition, ICPR 2024 - Kolkata, India
Duration: 1 Dec 2024 to 5 Dec 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15322 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 27th International Conference on Pattern Recognition, ICPR 2024
Country/Territory: India
City: Kolkata
Period: 1/12/24 to 5/12/24
