Multi-scale cropping mechanism for remote sensing image captioning

Xueting Zhang, Qi Wang, Shangdong Chen, Xuelong Li

Research output: Contribution to conference > Paper > peer-review

42 Scopus citations

Abstract

With the rapid development of artificial satellites, large numbers of high-resolution remote sensing images can now be obtained easily. Remote sensing image captioning, which aims to generate accurate and concise descriptive sentences for remote sensing images, has recently been advanced by template-based and encoder-decoder models, together with the release of several related datasets. Building on an encoder-decoder model, we propose a multi-scale cropping training mechanism for remote sensing image captioning, which extracts more fine-grained information from remote sensing images and improves the generalization performance of the base model. Experimental results on two datasets, UCM-captions and Sydney-captions, demonstrate that the proposed approach effectively improves performance in describing high-resolution remote sensing images.
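The abstract does not specify how the crops are sampled, so the following is only a minimal sketch of what a multi-scale cropping step during training might look like, not the authors' implementation. The function name `sample_crop_box`, the scale set `(1.0, 0.875, 0.75)`, and the choice of square crops are all assumptions made for illustration.

```python
import random


def sample_crop_box(width, height, scales=(1.0, 0.875, 0.75), rng=random):
    """Pick a random square crop window at a randomly chosen scale.

    Hypothetical helper: the scale values and the square-window choice are
    illustrative assumptions, not taken from the paper.
    Returns (left, top, right, bottom) in pixel coordinates.
    """
    # Choose one of the candidate scales for this training sample.
    scale = rng.choice(scales)
    # Side length of the crop, relative to the shorter image edge.
    side = max(1, round(min(width, height) * scale))
    # Random placement such that the window stays inside the image.
    left = rng.randint(0, width - side)
    top = rng.randint(0, height - side)
    return (left, top, left + side, top + side)


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(sample_crop_box(256, 256, rng=rng))
```

In a training loop, a box like this could be applied to each image (for example with Pillow's `Image.crop`) and the result resized to the encoder's fixed input size, so the same caption is paired with views of the scene at several scales.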

Original language: English
Pages: 10039-10042
Number of pages: 4
State: Published - 2019
Event: 39th IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2019 - Yokohama, Japan
Duration: 28 Jul 2019 to 2 Aug 2019

Conference

Conference: 39th IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2019
Country/Territory: Japan
City: Yokohama
Period: 28/07/19 to 2/08/19

Keywords

  • Encoder-decoder
  • Image captioning
  • Multi-scale cropping
  • Remote sensing image
