Learning to Compare Relation: Semantic Alignment for Few-Shot Learning

Congqi Cao; Yanning Zhang

doi:10.1109/TIP.2022.3142530

School of Computer Science

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

32 Scopus citations

Abstract

Few-shot learning is a fundamental and challenging problem since it requires recognizing novel categories from only a few examples. The objects to be recognized have multiple variants and can appear anywhere in an image. Directly comparing query images with example images cannot handle content misalignment. The representation and metric for comparison are critical but challenging to learn due to the scarcity and wide variation of the samples in few-shot learning. In this paper, we present a novel semantic alignment model to compare relations, which is robust to content misalignment. We propose to add two key ingredients to existing few-shot learning frameworks for better feature and metric learning ability. First, we introduce a semantic alignment loss to align the relation statistics of the features from samples that belong to the same category. Second, we introduce local and global mutual information maximization, which yields representations that contain locally consistent, intra-class shared information across structural locations in an image. Furthermore, we introduce a principled approach to weight multiple loss functions by considering the homoscedastic uncertainty of each stream. We conduct extensive experiments on several few-shot learning datasets. Experimental results show that the proposed method is capable of comparing relations with semantic alignment strategies and achieves state-of-the-art performance.
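
The abstract mentions weighting multiple loss functions by the homoscedastic uncertainty of each stream but does not give the formula. As a rough illustration only, the sketch below uses one commonly used simplified form of uncertainty weighting, in which each stream i has a learnable log-variance s_i and the combined loss is the sum of exp(-s_i) * L_i + s_i. The function name, the argument names, and the exact form of the weighting are assumptions made for this example and may differ from what the paper actually uses.

```python
import math


def uncertainty_weighted_loss(losses, log_vars):
    """Combine per-stream losses using per-stream log-variances.

    Hypothetical helper, not taken from the paper.
    losses:   list of scalar loss values L_i, one per stream
    log_vars: list of scalars s_i = log(sigma_i^2), one per stream
              (in training these would be learnable parameters)

    Returns sum_i( exp(-s_i) * L_i + s_i ).
    A large s_i down-weights a noisy stream, while the additive s_i
    term penalizes pushing every variance towards infinity.
    """
    if len(losses) != len(log_vars):
        raise ValueError("losses and log_vars must have the same length")
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))


if __name__ == "__main__":
    # Two streams with equal uncertainty (s = 0) reduce to a plain sum.
    print(uncertainty_weighted_loss([1.0, 2.0], [0.0, 0.0]))
    # Raising the second stream's log-variance halves its contribution
    # and adds the log(2) regularization term.
    print(uncertainty_weighted_loss([1.0, 2.0], [0.0, math.log(2.0)]))
```

In a real training loop the log-variances would be optimized jointly with the network weights, so that the balance between the streams is learned rather than tuned by hand.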

Original language	English
Pages (from-to)	1462-1474
Number of pages	13
Journal	IEEE Transactions on Image Processing
Volume	31
DOIs	https://doi.org/10.1109/TIP.2022.3142530
State	Published - 2022

Keywords

Feature extraction
Measurement
Mutual information
Semantics
Streaming media
Task analysis
Uncertainty

Cite this

@article{71f3a051bf45422b9028f0ab862854aa,

title = "Learning to Compare Relation: Semantic Alignment for Few-Shot Learning",

abstract = "Few-shot learning is a fundamental and challenging problem since it requires recognizing novel categories from only a few examples. The objects to be recognized have multiple variants and can appear anywhere in an image. Directly comparing query images with example images cannot handle content misalignment. The representation and metric for comparison are critical but challenging to learn due to the scarcity and wide variation of the samples in few-shot learning. In this paper, we present a novel semantic alignment model to compare relations, which is robust to content misalignment. We propose to add two key ingredients to existing few-shot learning frameworks for better feature and metric learning ability. First, we introduce a semantic alignment loss to align the relation statistics of the features from samples that belong to the same category. Second, we introduce local and global mutual information maximization, which yields representations that contain locally consistent, intra-class shared information across structural locations in an image. Furthermore, we introduce a principled approach to weight multiple loss functions by considering the homoscedastic uncertainty of each stream. We conduct extensive experiments on several few-shot learning datasets. Experimental results show that the proposed method is capable of comparing relations with semantic alignment strategies and achieves state-of-the-art performance.",

keywords = "Feature extraction, Measurement, Mutual information, Semantics, Streaming media, Task analysis, Uncertainty",

author = "Congqi Cao and Yanning Zhang",

note = "Publisher Copyright: 1941-0042 {\textcopyright} 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.",

year = "2022",

doi = "10.1109/TIP.2022.3142530",

language = "English",

volume = "31",

pages = "1462--1474",

journal = "IEEE Transactions on Image Processing",

issn = "1057-7149",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Learning to Compare Relation

T2 - Semantic Alignment for Few-Shot Learning

AU - Cao, Congqi

AU - Zhang, Yanning

N1 - Publisher Copyright: 1941-0042 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

PY - 2022

Y1 - 2022

AB - Few-shot learning is a fundamental and challenging problem since it requires recognizing novel categories from only a few examples. The objects to be recognized have multiple variants and can appear anywhere in an image. Directly comparing query images with example images cannot handle content misalignment. The representation and metric for comparison are critical but challenging to learn due to the scarcity and wide variation of the samples in few-shot learning. In this paper, we present a novel semantic alignment model to compare relations, which is robust to content misalignment. We propose to add two key ingredients to existing few-shot learning frameworks for better feature and metric learning ability. First, we introduce a semantic alignment loss to align the relation statistics of the features from samples that belong to the same category. Second, we introduce local and global mutual information maximization, which yields representations that contain locally consistent, intra-class shared information across structural locations in an image. Furthermore, we introduce a principled approach to weight multiple loss functions by considering the homoscedastic uncertainty of each stream. We conduct extensive experiments on several few-shot learning datasets. Experimental results show that the proposed method is capable of comparing relations with semantic alignment strategies and achieves state-of-the-art performance.

KW - Feature extraction

KW - Measurement

KW - Mutual information

KW - Semantics

KW - Streaming media

KW - Task analysis

KW - Uncertainty

UR - http://www.scopus.com/inward/record.url?scp=85123342652&partnerID=8YFLogxK

U2 - 10.1109/TIP.2022.3142530

DO - 10.1109/TIP.2022.3142530

M3 - Article

C2 - 35044916

AN - SCOPUS:85123342652

SN - 1057-7149

VL - 31

SP - 1462

EP - 1474

JO - IEEE Transactions on Image Processing

JF - IEEE Transactions on Image Processing

ER -
