TY - JOUR
T1 - Text-guided Distribution Calibration for Few-shot Object Detection in Remote Sensing Images
AU - Cao, Yu
AU - Chen, Jingyi
AU - Wang, Haoyu
AU - Zhang, Lei
AU - Ding, Chen
AU - Wei, Wei
AU - Cao, Shiqi
AU - Xie, Meilin
N1 - Publisher Copyright:
© 2008-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - In recent years, few-shot object detection (FSOD) in remote sensing images (RSIs) has received increasing attention. However, due to the large difference in the number of labeled samples between the base classes and the novel classes, using only visual information for object detection causes the features learned by the model to be biased toward the base classes, resulting in poor generalization to the novel classes with scarce labeled samples. In this paper, we propose a text-guided distribution calibration network for few-shot object detection in RSIs. Considering the limited visual information of the novel classes, we propose a cross-modal knowledge transfer strategy, which extracts the text feature of each object class name through the multi-modal pretrained model CLIP and transfers this text knowledge to the FSOD model to mitigate the feature bias problem. Following this idea, we design a text-guided distribution calibration module (TDCM), which, for each query image, utilizes the intra-image object class distribution defined by the text features to calibrate the object class distribution computed from the visual features, using a knowledge distillation loss for model training. In this way, the cross-class transferable text knowledge regularizes the learned visual features, sidestepping bias toward the base classes and thus improving generalization capacity. Experiments on the NWPU VHR-10 and DIOR datasets demonstrate the superior performance of the proposed method compared with several state-of-the-art methods.
AB - In recent years, few-shot object detection (FSOD) in remote sensing images (RSIs) has received increasing attention. However, due to the large difference in the number of labeled samples between the base classes and the novel classes, using only visual information for object detection causes the features learned by the model to be biased toward the base classes, resulting in poor generalization to the novel classes with scarce labeled samples. In this paper, we propose a text-guided distribution calibration network for few-shot object detection in RSIs. Considering the limited visual information of the novel classes, we propose a cross-modal knowledge transfer strategy, which extracts the text feature of each object class name through the multi-modal pretrained model CLIP and transfers this text knowledge to the FSOD model to mitigate the feature bias problem. Following this idea, we design a text-guided distribution calibration module (TDCM), which, for each query image, utilizes the intra-image object class distribution defined by the text features to calibrate the object class distribution computed from the visual features, using a knowledge distillation loss for model training. In this way, the cross-class transferable text knowledge regularizes the learned visual features, sidestepping bias toward the base classes and thus improving generalization capacity. Experiments on the NWPU VHR-10 and DIOR datasets demonstrate the superior performance of the proposed method compared with several state-of-the-art methods.
KW - distribution calibration
KW - few-shot object detection (FSOD)
KW - transfer-learning
KW - vision-language models
UR - http://www.scopus.com/inward/record.url?scp=105009387349&partnerID=8YFLogxK
U2 - 10.1109/JSTARS.2025.3582838
DO - 10.1109/JSTARS.2025.3582838
M3 - Article
AN - SCOPUS:105009387349
SN - 1939-1404
JO - IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
JF - IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
ER -