Text-guided Distribution Calibration for Few-shot Object Detection in Remote Sensing Images

Yu Cao, Jingyi Chen, Haoyu Wang, Lei Zhang, Chen Ding, Wei Wei, Shiqi Cao, Meilin Xie

Research output: Contribution to journal › Article › peer-review

Abstract

In recent years, few-shot object detection (FSOD) in remote sensing images (RSIs) has received increasing attention. However, due to the large difference in the number of labeled samples between the base classes and the novel classes, using only visual information for object detection causes the features learned by the model to be biased towards the base classes, resulting in poor generalization to the novel classes with scarce labeled samples. In this paper, we propose a text-guided distribution calibration network for few-shot object detection in RSIs. Considering the limited visual information of the novel classes, we propose a cross-modal knowledge transfer strategy that extracts the text feature of each object class name with the multi-modal pre-trained model CLIP and transfers this text knowledge to the FSOD model to mitigate the feature bias problem. Following this idea, we design a text-guided distribution calibration module (TDCM) which, for each query image, utilizes the intra-image object class distribution defined by the text features to calibrate the object class distribution computed from the visual features, using a knowledge distillation loss during model training. In this way, the cross-class transferable text knowledge regularizes the learned visual features, sidestepping the bias towards the base classes and thus improving the generalization capacity. Experiments on the NWPU VHR-10 and DIOR datasets demonstrate the superior performance of the proposed method compared with several state-of-the-art methods.
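The abstract describes distilling a text-defined class distribution into the detector's visual classifier. The snippet below is a minimal sketch of that idea, not the authors' implementation: it uses OpenAI's `clip` package to encode class-name prompts and aligns a detector's per-region class distribution with the text-defined one via a KL-divergence loss. The prompt template, the assumption that region features are projected into CLIP's embedding space, and names such as `tdcm_loss` and `temperature` are illustrative choices, not taken from the paper.

```python
# Sketch of text-guided distribution calibration, assuming OpenAI's `clip`
# package (pip install git+https://github.com/openai/CLIP.git) and PyTorch.
import torch
import torch.nn.functional as F
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

# Hypothetical remote-sensing class names; the prompt template is an assumption.
class_names = ["airplane", "ship", "storage tank", "baseball diamond"]
prompts = clip.tokenize([f"a photo of a {c}" for c in class_names]).to(device)

with torch.no_grad():
    text_feats = model.encode_text(prompts).float()   # (C, D)
    text_feats = F.normalize(text_feats, dim=-1)

def tdcm_loss(roi_feats, visual_logits, temperature=0.07):
    """Knowledge-distillation loss aligning the visual class distribution
    with the text-defined class distribution for each region proposal.

    roi_feats:     (N, D) region features, assumed projected into CLIP space.
    visual_logits: (N, C) class logits from the detector's classifier head.
    """
    roi_feats = F.normalize(roi_feats, dim=-1)
    # Text-defined distribution: cosine similarity of regions to class names.
    text_logits = roi_feats @ text_feats.t() / temperature   # (N, C)
    teacher = F.softmax(text_logits, dim=-1)
    student = F.log_softmax(visual_logits, dim=-1)
    # KL divergence distills the text distribution into the detector.
    return F.kl_div(student, teacher, reduction="batchmean")
```

Since the class names are fixed during training, the text embeddings can be computed once and cached, so the extra cost per training step is a single matrix product and a KL term.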

Keywords

  • distribution calibration
  • few-shot object detection (FSOD)
  • transfer learning
  • vision-language models
