TY - JOUR
T1 - Like Humans to Few-Shot Learning Through Knowledge Permeation of Visual and Language
AU - Jia, Yuyu
AU - Zhou, Qing
AU - Gao, Junyu
AU - Li, Qiang
AU - Wang, Qi
N1 - Publisher Copyright: © 1999-2012 IEEE.
PY - 2025
Y1 - 2025
AB - Few-shot learning aims to generalize the recognizer from seen categories to an entirely novel scenario. With only a few support samples, several advanced methods initially introduce class names as prior knowledge for identifying novel classes. However, obstacles still impede achieving a comprehensive understanding of how to harness the mutual advantages of visual and textual knowledge. In this paper, we set out to fill this gap via a coherent Bidirectional Knowledge Permeation strategy called BiKop, which is grounded in human intuition: a class name description offers a more general representation, whereas an image captures the specificity of individuals. BiKop primarily establishes a hierarchical joint general-specific representation through bidirectional knowledge permeation. On the other hand, considering the bias of joint representation towards the base set, we disentangle base-class-relevant semantics during training, thereby alleviating the suppression of potential novel-class-relevant information. Experiments on four challenging benchmarks demonstrate the remarkable superiority of BiKop, particularly outperforming previous methods by a substantial margin in the 1-shot setting (improving the accuracy by 7.58% on miniImageNet).
KW - Few-shot learning
KW - class-relevant information
KW - knowledge disparity
UR - https://www.scopus.com/pages/publications/105015474525
U2 - 10.1109/TMM.2025.3604977
DO - 10.1109/TMM.2025.3604977
M3 - Article
AN - SCOPUS:105015474525
SN - 1520-9210
VL - 27
SP - 7905
EP - 7916
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -