Visual–Semantic Fuzzy Interaction Network for Zero-Shot Learning

Xuemeng Hui, Zhunga Liu, Jiaxiang Liu, Zuowei Zhang, Longfei Wang

Research output: Contribution to journal › Article › peer-review

Abstract

Zero-shot learning (ZSL) aims to recognize image objects of unseen classes using manually defined semantic knowledge corresponding to both seen and unseen images. The key to ZSL lies in building the interaction between precise image data and fuzzy semantic knowledge, where the fuzziness stems from the difficulty of quantifying human knowledge. However, existing ZSL methods ignore this inherent fuzziness and treat semantic knowledge as precise data when building the visual–semantic interaction, which hinders the transfer of semantic knowledge from seen classes to unseen classes. To solve this problem, we propose a visual–semantic fuzzy interaction network (VSFIN) for ZSL. VSFIN uses an encoder–decoder structure consisting of a semantic prototype encoder (SPE) and a visual feature decoder (VFD), which enable the visual features to interact with semantic knowledge via cross-attention. To achieve visual–semantic fuzzy interaction in the SPE and VFD, we introduce the concept of a membership function from fuzzy set theory and design a membership loss function. This loss function allows for a certain degree of imprecision in the visual–semantic interaction, thereby enabling VSFIN to make appropriate use of the given semantic knowledge. Moreover, we draw on the concept of the rank sum test and propose a distribution alignment loss to alleviate the bias towards seen classes. Extensive experiments on three widely used benchmarks demonstrate that VSFIN outperforms current state-of-the-art methods under both conventional ZSL (CZSL) and generalized ZSL (GZSL) settings.
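
The abstract does not give the form of the membership loss, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes a Gaussian membership function from fuzzy set theory and a hypothetical loss defined as one minus the mean membership degree; the names `gaussian_membership`, `membership_loss`, and the width parameter `sigma` are all assumptions introduced here. The point it shows is that predictions close to the annotated attribute values incur almost no penalty, which is one way a loss can tolerate imprecise semantic annotations.

```python
import math


def gaussian_membership(x, center, sigma):
    """Degree in (0, 1] to which x belongs to the fuzzy set centered at `center`."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def membership_loss(predicted, annotated, sigma=0.2):
    """Hypothetical loss: 1 minus the mean membership of each prediction
    in the fuzzy set around its annotated attribute value.

    `sigma` (an assumed hyperparameter) controls how much deviation
    from the annotation is tolerated before the penalty grows.
    """
    if len(predicted) != len(annotated):
        raise ValueError("predicted and annotated must have the same length")
    degrees = [
        gaussian_membership(p, a, sigma) for p, a in zip(predicted, annotated)
    ]
    return 1.0 - sum(degrees) / len(degrees)


# Made-up attribute annotations for one class and two candidate predictions.
annotated = [0.9, 0.1, 0.5]
near = [0.88, 0.12, 0.52]  # small deviations from the annotations
far = [0.3, 0.8, 0.0]      # large deviations from the annotations

loss_near = membership_loss(near, annotated)
loss_far = membership_loss(far, annotated)
print(f"near: {loss_near:.4f}, far: {loss_far:.4f}")
```

Unlike a squared-error loss, this sketch saturates for large deviations and is nearly flat around the annotated value, so small disagreements with a human-assigned attribute score are treated as acceptable rather than as errors.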

Original language: English
Pages (from-to): 1345-1359
Number of pages: 15
Journal: IEEE Transactions on Artificial Intelligence
Volume: 6
Issue number: 5
State: Published - 2025

Keywords

  • Fuzzy set theory
  • knowledge transfer
  • membership function
  • object recognition
  • zero-shot learning
