Instance-Aware Remote Sensing Image Captioning with Cross-Hierarchy Attention

Chengze Wang; Zhiyu Jiang; Yuan Yuan

doi:10.1109/IGARSS39084.2020.9323213

Institute of Optoelectronics and Intelligence

Northwestern Polytechnical University Xian

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

15 Citations (Scopus)

Abstract

Spatial attention is a straightforward way to improve the performance of remote sensing image captioning. However, conventional spatial attention approaches consider the attention distribution on only one fixed coarse grid, so the semantics of tiny objects are easily ignored or disturbed during visual feature extraction. Worse still, the fixed semantic level of conventional spatial attention limits image understanding at different levels and from different perspectives, which is critical for handling the huge diversity of remote sensing images. To address these issues, we propose a remote sensing image caption generator with instance awareness and cross-hierarchy attention. 1) Instance awareness is achieved by introducing a multi-level feature architecture that contains the visual information of multi-level instance-possible regions and their surroundings. 2) Building on this multi-level feature extraction, a cross-hierarchy attention mechanism is proposed to prompt the decoder to dynamically focus on different semantic hierarchies and instances at each time step. Experimental results on public datasets demonstrate the superiority of the proposed approach over existing methods.

Original language	English
Title of host publication	2020 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2020 - Proceedings
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	980-983
Number of pages	4
ISBN (Electronic)	9781728163741
DOI	https://doi.org/10.1109/IGARSS39084.2020.9323213
Publication status	Published - 26 Sep 2020
Event	2020 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2020 - Virtual, Waikoloa, United States. Duration: 26 Sep 2020 → 2 Oct 2020

Publication series

Name	International Geoscience and Remote Sensing Symposium (IGARSS)

Conference

Conference	2020 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2020
Country/Territory	United States
City	Virtual, Waikoloa
Period	26/09/20 → 2/10/20

Access to Document

10.1109/IGARSS39084.2020.9323213

Cite this

Wang, C., Jiang, Z., & Yuan, Y. (2020). Instance-Aware Remote Sensing Image Captioning with Cross-Hierarchy Attention. In 2020 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2020 - Proceedings (pp. 980-983). Article 9323213 (International Geoscience and Remote Sensing Symposium (IGARSS)). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IGARSS39084.2020.9323213

@inproceedings{ff019176343443d1a8f2a0c905354324,

title = "Instance-Aware Remote Sensing Image Captioning with Cross-Hierarchy Attention",

abstract = "The spatial attention is a straightforward approach to enhance the performance for remote sensing image captioning. However, conventional spatial attention approaches consider only the attention distribution on one fixed coarse grid, resulting in the semantics of tiny objects can be easily ignored or disturbed during the visual feature extraction. Worse still, the fixed semantic level of conventional spatial attention limits the image understanding in different levels and perspectives, which is critical for tackling the huge diversity in remote sensing images. To address these issues, we propose a remote sensing image caption generator with instance-awareness and cross-hierarchy attention. 1) The instances awareness is achieved by introducing a multi-level feature architecture that contains the visual information of multi-level instance-possible regions and their surroundings. 2) Moreover, based on this multi-level feature extraction, a cross-hierarchy attention mechanism is proposed to prompt the decoder to dynamically focus on different semantic hierarchies and instances at each time step. The experimental results on public datasets demonstrate the superiority of proposed approach over existing methods.",

keywords = "Remote sensing image captioning, semantic understanding, visual attention",

author = "Chengze Wang and Zhiyu Jiang and Yuan Yuan",

note = "Publisher Copyright: {\textcopyright} 2020 IEEE.; 2020 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2020 ; Conference date: 26-09-2020 Through 02-10-2020",

year = "2020",

month = sep,

day = "26",

doi = "10.1109/IGARSS39084.2020.9323213",

language = "English",

series = "International Geoscience and Remote Sensing Symposium (IGARSS)",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "980--983",

booktitle = "2020 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2020 - Proceedings",

}

Wang, C, Jiang, Z & Yuan, Y 2020, Instance-Aware Remote Sensing Image Captioning with Cross-Hierarchy Attention. in 2020 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2020 - Proceedings., 9323213, International Geoscience and Remote Sensing Symposium (IGARSS), Institute of Electrical and Electronics Engineers Inc., pp. 980-983, 2020 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2020, Virtual, Waikoloa, United States, 26/09/20. https://doi.org/10.1109/IGARSS39084.2020.9323213

Instance-Aware Remote Sensing Image Captioning with Cross-Hierarchy Attention. / Wang, Chengze; Jiang, Zhiyu; Yuan, Yuan.
2020 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2020 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2020. p. 980-983 9323213 (International Geoscience and Remote Sensing Symposium (IGARSS)).

TY - GEN

T1 - Instance-Aware Remote Sensing Image Captioning with Cross-Hierarchy Attention

AU - Wang, Chengze

AU - Jiang, Zhiyu

AU - Yuan, Yuan

PY - 2020/9/26

Y1 - 2020/9/26

N2 - The spatial attention is a straightforward approach to enhance the performance for remote sensing image captioning. However, conventional spatial attention approaches consider only the attention distribution on one fixed coarse grid, resulting in the semantics of tiny objects can be easily ignored or disturbed during the visual feature extraction. Worse still, the fixed semantic level of conventional spatial attention limits the image understanding in different levels and perspectives, which is critical for tackling the huge diversity in remote sensing images. To address these issues, we propose a remote sensing image caption generator with instance-awareness and cross-hierarchy attention. 1) The instances awareness is achieved by introducing a multi-level feature architecture that contains the visual information of multi-level instance-possible regions and their surroundings. 2) Moreover, based on this multi-level feature extraction, a cross-hierarchy attention mechanism is proposed to prompt the decoder to dynamically focus on different semantic hierarchies and instances at each time step. The experimental results on public datasets demonstrate the superiority of proposed approach over existing methods.

AB - The spatial attention is a straightforward approach to enhance the performance for remote sensing image captioning. However, conventional spatial attention approaches consider only the attention distribution on one fixed coarse grid, resulting in the semantics of tiny objects can be easily ignored or disturbed during the visual feature extraction. Worse still, the fixed semantic level of conventional spatial attention limits the image understanding in different levels and perspectives, which is critical for tackling the huge diversity in remote sensing images. To address these issues, we propose a remote sensing image caption generator with instance-awareness and cross-hierarchy attention. 1) The instances awareness is achieved by introducing a multi-level feature architecture that contains the visual information of multi-level instance-possible regions and their surroundings. 2) Moreover, based on this multi-level feature extraction, a cross-hierarchy attention mechanism is proposed to prompt the decoder to dynamically focus on different semantic hierarchies and instances at each time step. The experimental results on public datasets demonstrate the superiority of proposed approach over existing methods.

KW - Remote sensing image captioning

KW - semantic understanding

KW - visual attention

UR - http://www.scopus.com/inward/record.url?scp=85101988600&partnerID=8YFLogxK

U2 - 10.1109/IGARSS39084.2020.9323213

DO - 10.1109/IGARSS39084.2020.9323213

M3 - Conference contribution

AN - SCOPUS:85101988600

T3 - International Geoscience and Remote Sensing Symposium (IGARSS)

SP - 980

EP - 983

BT - 2020 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2020 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2020 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2020

Y2 - 26 September 2020 through 2 October 2020

ER -

Wang C, Jiang Z, Yuan Y. Instance-Aware Remote Sensing Image Captioning with Cross-Hierarchy Attention. In 2020 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2020 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2020. p. 980-983. 9323213. (International Geoscience and Remote Sensing Symposium (IGARSS)). doi: 10.1109/IGARSS39084.2020.9323213
