Transformer-Based Few-Shot Object Detection with Multi-Relation Matching for Remote Sensing Images

Lefan Wang, Jiawei Lian, Yan Feng, Xiaoning Chen, Shaohui Mei

Research output: Contribution to conference › Paper › peer-review

1 Scopus citation

Abstract

Few-shot object detection (FSOD) in remote sensing images (RSIs) has garnered significant research interest because it can detect novel classes in challenging remote sensing scenes from only a few training examples. Meta-learning FSOD methods built on Faster R-CNN and YOLO architectures use a two-branch Siamese network as the backbone and compute the similarity between image regions for detection. However, almost all of these methods extract features with convolutional neural networks (CNNs). Inspired by the improved performance of transformer backbones on downstream tasks, a transformer-based FSOD method is proposed that employs a transformer backbone with asymmetric-batched cross-attention for two-branch feature extraction. Our model further improves classification performance by introducing a Multi-Relation Matching (MRM) head that enhances similarity-relation matching between the two branches. Comprehensive experiments on the DIOR benchmark demonstrate the effectiveness of our model.
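The abstract gives no implementation details, so as an illustrative aside, below is a minimal PyTorch sketch of what a multi-relation matching head between the query and support branches could look like, assuming RoI-aligned features of equal size from both branches. The class name MultiRelationMatchingHead, the choice of three relation branches (global, local, patch), and all layer sizes are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiRelationMatchingHead(nn.Module):
    """Illustrative matching head: scores a query RoI feature against a
    support (class prototype) feature with three relation branches and
    sums the scores into one similarity value per pair."""

    def __init__(self, channels=256):
        super().__init__()
        # Global relation: spatially pooled descriptors scored by an MLP.
        self.global_fc = nn.Sequential(
            nn.Linear(channels * 2, channels), nn.ReLU(inplace=True),
            nn.Linear(channels, 1),
        )
        # Patch relation: channel-concatenated maps passed through a conv + MLP.
        self.patch_conv = nn.Sequential(
            nn.Conv2d(channels * 2, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.patch_fc = nn.Linear(channels, 1)

    def forward(self, query_feat, support_feat):
        # query_feat, support_feat: (N, C, S, S) RoI-aligned feature maps.
        n, c, s, _ = query_feat.shape

        # Global relation on spatially pooled descriptors.
        q_vec = F.adaptive_avg_pool2d(query_feat, 1).flatten(1)    # (N, C)
        s_vec = F.adaptive_avg_pool2d(support_feat, 1).flatten(1)  # (N, C)
        global_score = self.global_fc(torch.cat([q_vec, s_vec], dim=1))

        # Local relation: depth-wise correlation of the query map with the
        # support map used as a per-channel kernel, averaged over space.
        kernel = support_feat.reshape(n * c, 1, s, s)
        corr = F.conv2d(query_feat.reshape(1, n * c, s, s), kernel,
                        groups=n * c, padding=s // 2)
        local_score = corr.reshape(n, c, -1).mean(dim=(1, 2)).unsqueeze(1)

        # Patch relation on the channel-concatenated feature maps.
        patch = self.patch_conv(torch.cat([query_feat, support_feat], dim=1))
        patch_score = self.patch_fc(F.adaptive_avg_pool2d(patch, 1).flatten(1))

        # Final matching score: sum of the three relation scores, shape (N, 1).
        return global_score + local_score + patch_score


if __name__ == "__main__":
    head = MultiRelationMatchingHead(channels=256)
    q = torch.randn(4, 256, 7, 7)  # query RoI features
    s = torch.randn(4, 256, 7, 7)  # support prototype features, one per query
    print(head(q, s).shape)        # torch.Size([4, 1])
```

In such a design the summed score would feed the few-shot classification branch, while box regression stays on the query branch; again, this is only a sketch of the general multi-relation matching idea, not the paper's MRM head.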

Original language: English
Pages: 8046-8049
Number of pages: 4
DOIs
State: Published - 2024
Event: 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024 - Athens, Greece
Duration: 7 Jul 2024 – 12 Jul 2024

Conference

Conference: 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024
Country/Territory: Greece
City: Athens
Period: 7/07/24 – 12/07/24

Keywords

  • few-shot learning
  • object detection
  • remote sensing images
  • Transformer
