CAMCFormer: Cross-Attention and Multicorrelation Aided Transformer for Few-Shot Object Detection in Optical Remote Sensing Images

Lefan Wang, Shaohui Mei, Yi Wang, Jiawei Lian, Zonghao Han, Yan Feng

Research output: Contribution to journal › Article › peer-review

Abstract

Few-shot object detection (FSOD) enables the detection of novel-class objects in remote sensing images (RSIs) with limited labeled samples. Although convolutional neural networks (CNNs) are commonly used for this task, they suffer from two inherent constraints. First, their limited local receptive field fails to capture both the global context within a single image and the relational dependencies between query and support images. Second, an additional feature alignment mechanism is typically required to bridge the gap between query and support images. To address these challenges, this work introduces a novel cross-attention and multicorrelation aided transformer (CAMCFormer) FSOD framework tailored for global feature representation and multicorrelation modeling in complex and large-scale RSIs. Specifically, a long-distance cross-attention module (LDCAM) is devised to capture dependencies between distant elements across query and support images at each feature extraction layer. This module facilitates the exchange of contextual information between images, resulting in more comprehensive feature representations and eliminating the need for separate feature alignment and fusion modules. To further enhance detection performance, multicorrelation aided heads (MAHs) are constructed to model various relational aspects, namely a channel-correlation detection head (CCDH), a spatial-correlation detection head (SCDH), and a cross-attention detection head (CADH). These aided heads contribute to more robust and accurate classification and localization. Comprehensive experiments demonstrate the superiority of the proposed framework over several state-of-the-art detectors, highlighting its potential as an effective solution for FSOD in remote sensing scenarios.
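
The exchange of information between query and support images described in the abstract rests on cross-attention. The sketch below illustrates only the generic scaled dot-product form, in which each query-image token attends over all support-image tokens. It is not the paper's LDCAM: the function names, the toy token values, and the omission of learned query/key/value projections are all assumptions made for brevity.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query_tokens, support_tokens):
    """Generic scaled dot-product cross-attention (illustrative only).

    Each query token attends over all support tokens. Both inputs are
    lists of equal-length feature vectors. The learned Q/K/V projections
    of a real transformer are omitted here (treated as identity), which
    is a simplifying assumption, not the paper's design.
    """
    d = len(query_tokens[0])
    scale = 1.0 / math.sqrt(d)
    output = []
    for q in query_tokens:
        # Similarity of this query token to every support token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k))
                  for k in support_tokens]
        weights = softmax(scores)
        # Weighted sum of support tokens, one value per channel.
        attended = [sum(w * v[j] for w, v in zip(weights, support_tokens))
                    for j in range(d)]
        output.append(attended)
    return output


# Hypothetical toy data: 3 query-image tokens, 2 support-image tokens, 4 channels.
query = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]]
support = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
fused = cross_attention(query, support)
```

Each row of `fused` is a convex combination of the support tokens, so a query token that resembles one support token more closely draws most of its output from that token. Because every query token can attend to every support token regardless of spatial position, this kind of operation is not bound by a local receptive field.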

Original language: English
Article number: 5613316
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 63
State: Published - 2025

Keywords

  • Few-shot learning (FSL)
  • object detection
  • optical remote sensing images (RSIs)
  • transformer
