Abstract
Remote sensing image captioning (RSIC) aims to generate accurate and concise textual descriptions for remote sensing (RS) images. It plays a significant role in the analysis of Earth observation data. The success of vision-and-language pretraining (VLP) models provides the foundation for their transfer to the RSIC task. To reduce the cost of transferring VLP models to downstream tasks, numerous parameter-efficient transfer learning (PETL) techniques have been proposed. However, most of them focus on fine-tuning general-purpose foundation models without fully considering the unique characteristics of RS data. In this article, we introduce PE-RSIC, a novel PETL framework tailored for RSIC. Specifically, the framework builds on a pretrained BLIP-2 model and adds two components: a lightweight cross-modal RS adapter (CRS-Adapter) and a Class Prompt. During training, all parameters of the pretrained model remain frozen, and the newly added CRS-Adapter modules are updated to efficiently transfer vision-and-language knowledge from the natural-image domain to the RS domain. The Class Prompt is obtained by projecting the vision-encoded [CLS] token into the decoder, guiding the model to generate more accurate captions. This approach enables the model to capture critical RS class features that might be lost during the query decoding process, with only a minimal increase in parameters. Extensive experiments show that our PE-RSIC framework outperforms full fine-tuning while using only 5% of the trainable parameters.
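The two ideas named in the abstract, training only small added modules next to a frozen backbone and feeding a projected [CLS] token to the decoder, can be illustrated with a toy sketch in plain Python. The abstract does not describe the internals of the CRS-Adapter, so the code substitutes a generic residual bottleneck adapter; every name, dimension, and parameter count below (`BottleneckAdapter`, `build_decoder_input`, `FROZEN_PARAMS`, and so on) is a hypothetical chosen for illustration, not taken from the paper or from BLIP-2.

```python
import random


def linear(weight, vec):
    """Multiply a (rows x cols) matrix stored as nested lists by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def init_matrix(rows, cols, rng, scale=0.02):
    """Small random Gaussian initialisation."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class BottleneckAdapter:
    """Generic residual adapter: x + up(relu(down(x))).

    ASSUMPTION: a stand-in for the CRS-Adapter, whose actual design
    is not given in the abstract.
    """

    def __init__(self, dim, bottleneck, rng):
        self.down = init_matrix(bottleneck, dim, rng)
        # Zero-initialised up-projection: the adapter starts as the identity,
        # so the frozen model's behaviour is unchanged before training.
        self.up = [[0.0] * bottleneck for _ in range(dim)]

    def __call__(self, x):
        hidden = [max(0.0, h) for h in linear(self.down, x)]
        return [xi + di for xi, di in zip(x, linear(self.up, hidden))]

    def num_params(self):
        return sum(len(r) for r in self.down) + sum(len(r) for r in self.up)


def build_decoder_input(cls_token, class_proj, query_tokens):
    """Project the vision [CLS] token to decoder width and prepend it.

    ASSUMPTION: prepending is one plausible way to inject the prompt.
    """
    class_prompt = linear(class_proj, cls_token)
    return [class_prompt] + query_tokens


def trainable_fraction(num_frozen, num_trainable):
    """Share of all parameters that receive gradient updates."""
    return num_trainable / (num_frozen + num_trainable)


# Toy sizes, chosen only to keep the example small.
rng = random.Random(0)
VISION_DIM, DECODER_DIM, BOTTLENECK, NUM_QUERIES = 8, 6, 2, 4

adapter = BottleneckAdapter(VISION_DIM, BOTTLENECK, rng)
class_proj = init_matrix(DECODER_DIM, VISION_DIM, rng)

cls_token = [rng.gauss(0.0, 1.0) for _ in range(VISION_DIM)]
adapted = adapter(cls_token)  # identical to cls_token at initialisation

query_tokens = [[0.0] * DECODER_DIM for _ in range(NUM_QUERIES)]
decoder_input = build_decoder_input(cls_token, class_proj, query_tokens)

FROZEN_PARAMS = 1000  # hypothetical backbone size, not a real model's count
new_params = adapter.num_params() + DECODER_DIM * VISION_DIM
print(len(decoder_input), trainable_fraction(FROZEN_PARAMS, new_params))
```

The zero-initialised up-projection is a common adapter convention rather than something the abstract states. It means the added modules do not disturb the pretrained model until training moves them away from zero.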
| Original language | English |
|---|---|
| Article number | 5630512 |
| Journal | IEEE Transactions on Geoscience and Remote Sensing |
| Volume | 63 |
| DOIs | |
| State | Published - 2025 |
Keywords
- Image captioning
- parameter-efficient transfer learning (PETL)
- remote sensing (RS)
Article title: Parameter-Efficient Transfer Learning for Remote Sensing Image Captioning