Transformer-Based Few-Shot Object Detection with Multi-Relation Matching for Remote Sensing Images

Lefan Wang, Jiawei Lian, Yan Feng, Xiaoning Chen, Shaohui Mei

Research output: Contribution to conference › Paper › peer-review

1 Scopus citation

Abstract

Few-shot object detection (FSOD) in remote sensing images (RSIs) has garnered significant research interest because it can detect novel classes in challenging remote sensing scenes from only a few training examples. Meta-learning FSOD methods built on Faster R-CNN and YOLO architectures use a two-branch Siamese network as the backbone and compute the similarity between image regions for detection. However, almost all of these methods extract features with convolutional neural networks (CNNs). Inspired by the improved performance of transformer backbones on downstream tasks, a transformer-based FSOD method is proposed that employs a transformer backbone with asymmetric-batched cross-attention for two-branch feature extraction. Our model further improves classification performance by introducing a Multi-Relation Matching (MRM) head that enhances similarity-relation matching between the two branches. Comprehensive experiments on the DIOR benchmark demonstrate the effectiveness of our model.
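The abstract gives no implementation details, so as an illustrative aside, below is a minimal PyTorch sketch of what a multi-relation matching head between the query and support branches could look like, assuming RoI-aligned features of equal size from both branches. The class name MultiRelationMatchingHead, the choice of three relation branches (global, local, patch), and all layer sizes are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiRelationMatchingHead(nn.Module):
    """Illustrative matching head: scores a query RoI feature against a
    support (class prototype) feature with three relation branches and
    sums the scores into one similarity value per pair."""

    def __init__(self, channels=256):
        super().__init__()
        # Global relation: spatially pooled descriptors scored by an MLP.
        self.global_fc = nn.Sequential(
            nn.Linear(channels * 2, channels), nn.ReLU(inplace=True),
            nn.Linear(channels, 1),
        )
        # Patch relation: channel-concatenated maps passed through a conv + MLP.
        self.patch_conv = nn.Sequential(
            nn.Conv2d(channels * 2, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.patch_fc = nn.Linear(channels, 1)

    def forward(self, query_feat, support_feat):
        # query_feat, support_feat: (N, C, S, S) RoI-aligned feature maps.
        n, c, s, _ = query_feat.shape

        # Global relation on spatially pooled descriptors.
        q_vec = F.adaptive_avg_pool2d(query_feat, 1).flatten(1)    # (N, C)
        s_vec = F.adaptive_avg_pool2d(support_feat, 1).flatten(1)  # (N, C)
        global_score = self.global_fc(torch.cat([q_vec, s_vec], dim=1))

        # Local relation: depth-wise correlation of the query map with the
        # support map used as a per-channel kernel, averaged over space.
        kernel = support_feat.reshape(n * c, 1, s, s)
        corr = F.conv2d(query_feat.reshape(1, n * c, s, s), kernel,
                        groups=n * c, padding=s // 2)
        local_score = corr.reshape(n, c, -1).mean(dim=(1, 2)).unsqueeze(1)

        # Patch relation on the channel-concatenated feature maps.
        patch = self.patch_conv(torch.cat([query_feat, support_feat], dim=1))
        patch_score = self.patch_fc(F.adaptive_avg_pool2d(patch, 1).flatten(1))

        # Final matching score: sum of the three relation scores, shape (N, 1).
        return global_score + local_score + patch_score


if __name__ == "__main__":
    head = MultiRelationMatchingHead(channels=256)
    q = torch.randn(4, 256, 7, 7)  # query RoI features
    s = torch.randn(4, 256, 7, 7)  # support prototype features, one per query
    print(head(q, s).shape)        # torch.Size([4, 1])
```

In such a design the summed score would feed the few-shot classification branch, while box regression stays on the query branch; again, this is only a sketch of the general multi-relation matching idea, not the paper's MRM head.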

Original language: English
Pages: 8046-8049
Number of pages: 4
DOIs
State: Published - 2024
Event: 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024 - Athens, Greece
Duration: 7 Jul 2024 – 12 Jul 2024

Conference

Conference: 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024
Country/Territory: Greece
City: Athens
Period: 7/07/24 – 12/07/24

Keywords

  • few-shot learning
  • object detection
  • remote sensing images
  • Transformer
