Abstract
With the rapid development of artificial satellites, large numbers of high-resolution remote sensing images are now easy to obtain. Remote sensing image captioning, which aims to generate accurate and concise descriptive sentences for remote sensing images, has recently been advanced by template-based and encoder-decoder models, along with the release of several related datasets. Building on an encoder-decoder model, we propose a multi-scale cropping training mechanism for remote sensing image captioning, which extracts more fine-grained information from remote sensing images and enhances the generalization performance of the base model. Experimental results on two datasets, UCM-captions and Sydney-captions, demonstrate that the proposed approach effectively improves performance in describing high-resolution remote sensing images.
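As a rough illustration of the multi-scale cropping idea mentioned in the abstract, the sketch below picks a random scale, crops a square window of that relative size, and resizes it back to a fixed input size. The abstract does not give the actual scales, crop positions, or resizing method, so the function name `multi_scale_crop`, the default `scales`, and the nearest-neighbour resize are all assumptions made for illustration, not the authors' implementation.

```python
import random


def multi_scale_crop(image, scales=(1.0, 0.875, 0.75), out_size=4, rng=None):
    """Crop a random square window at a randomly chosen scale, then resize.

    `image` is a 2-D list of pixel values (rows of equal length).
    The scale set and the nearest-neighbour resize are illustrative
    assumptions; the source abstract does not specify them.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])

    # Choose one of the candidate scales for this training sample.
    scale = rng.choice(scales)
    side = max(1, int(min(height, width) * scale))

    # Choose a random top-left corner so the window stays inside the image.
    top = rng.randint(0, height - side)
    left = rng.randint(0, width - side)
    crop = [row[left:left + side] for row in image[top:top + side]]

    # Nearest-neighbour resize of the crop to out_size x out_size,
    # so every crop matches the encoder's fixed input size.
    return [
        [crop[(r * side) // out_size][(c * side) // out_size]
         for c in range(out_size)]
        for r in range(out_size)
    ]


if __name__ == "__main__":
    toy_image = [[r * 8 + c for c in range(8)] for r in range(8)]
    patch = multi_scale_crop(toy_image, out_size=4, rng=random.Random(0))
    print(len(patch), len(patch[0]))
```

In a training loop, a function like this would be applied to each image before it is passed to the encoder, so the same caption is paired with views of the scene at several levels of detail.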
Original language | English |
---|---|
Pages | 10039-10042 |
Number of pages | 4 |
State | Published - 2019 |
Event | 39th IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2019 - Yokohama, Japan; Duration: 28 Jul 2019 → 2 Aug 2019 |
Conference
Conference | 39th IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2019 |
---|---|
Country/Territory | Japan |
City | Yokohama |
Period | 28/07/19 → 2/08/19 |
Keywords
- Encoder-decoder
- Image captioning
- Multi-scale cropping
- Remote sensing image