Feedback Attention Model for Image Captioning (面向图像自动语句标注的注意力反馈模型)

Fan Lyu; Fuyuan Hu; Yanning Zhang; Zhenping Xia; Victor S. Sheng

doi:10.3724/SP.J.1089.2019.17505

School of Computer Science

Research output: Contribution to journal › Article › Peer-reviewed

3 citations (Scopus)

Abstract

Image captioning aims to have a machine generate a relevant sentence for a given image, and it has been applied to service robots. To improve captioning performance, some researchers leverage the attention mechanism. However, this mechanism often suffers from distraction and sentence disorder. In this paper, we propose an image captioning model based on a novel feedback attention mechanism. When generating the language for a given image, the proposed model uses attention feedback from the generated language. With this feedback, the attention heatmap over the original image is revised, which in turn improves the generated sentence. We evaluate the proposed method on three benchmark datasets, Flickr8k, Flickr30k, and MSCOCO, and the experimental results show the superiority of the proposed method.

Translated title of the contribution	Feedback Attention Model for Image Captioning
Original language	Chinese (Traditional)
Pages (from-to)	1122-1129
Number of pages	8
Journal	Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics
Volume	31
Issue number	7
DOI	https://doi.org/10.3724/SP.J.1089.2019.17505
Publication status	Published - 1 Jul 2019

Keywords

Attention feedback
Attention mechanism
Image captioning

Access to document

10.3724/SP.J.1089.2019.17505

Other files and links

Link to publication in Scopus

Cite this

@article{5556618020d349b99bb39c31aa38a78a,

title = "面向图像自动语句标注的注意力反馈模型",

abstract = "Image captioning aims to have a machine generate a relevant sentence for a given image, and it has been applied to service robots. To improve captioning performance, some researchers leverage the attention mechanism. However, this mechanism often suffers from distraction and sentence disorder. In this paper, we propose an image captioning model based on a novel feedback attention mechanism. When generating the language for a given image, the proposed model uses attention feedback from the generated language. With this feedback, the attention heatmap over the original image is revised, which in turn improves the generated sentence. We evaluate the proposed method on three benchmark datasets, Flickr8k, Flickr30k, and MSCOCO, and the experimental results show the superiority of the proposed method.",

keywords = "Attention feedback, Attention mechanism, Image captioning",

author = "Fan Lyu and Fuyuan Hu and Yanning Zhang and Zhenping Xia and Sheng, {Victor S.}",

year = "2019",

month = jul,

day = "1",

doi = "10.3724/SP.J.1089.2019.17505",

language = "Chinese (Traditional)",

volume = "31",

pages = "1122--1129",

journal = "Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics",

issn = "1003-9775",

publisher = "Institute of Computing Technology",

number = "7",

}

TY - JOUR

T1 - 面向图像自动语句标注的注意力反馈模型

AU - Lyu, Fan

AU - Hu, Fuyuan

AU - Zhang, Yanning

AU - Xia, Zhenping

AU - Sheng, Victor S.

PY - 2019/7/1

Y1 - 2019/7/1

N2 - Image captioning aims to have a machine generate a relevant sentence for a given image, and it has been applied to service robots. To improve captioning performance, some researchers leverage the attention mechanism. However, this mechanism often suffers from distraction and sentence disorder. In this paper, we propose an image captioning model based on a novel feedback attention mechanism. When generating the language for a given image, the proposed model uses attention feedback from the generated language. With this feedback, the attention heatmap over the original image is revised, which in turn improves the generated sentence. We evaluate the proposed method on three benchmark datasets, Flickr8k, Flickr30k, and MSCOCO, and the experimental results show the superiority of the proposed method.

AB - Image captioning aims to have a machine generate a relevant sentence for a given image, and it has been applied to service robots. To improve captioning performance, some researchers leverage the attention mechanism. However, this mechanism often suffers from distraction and sentence disorder. In this paper, we propose an image captioning model based on a novel feedback attention mechanism. When generating the language for a given image, the proposed model uses attention feedback from the generated language. With this feedback, the attention heatmap over the original image is revised, which in turn improves the generated sentence. We evaluate the proposed method on three benchmark datasets, Flickr8k, Flickr30k, and MSCOCO, and the experimental results show the superiority of the proposed method.

KW - Attention feedback

KW - Attention mechanism

KW - Image captioning

UR - http://www.scopus.com/inward/record.url?scp=85073154567&partnerID=8YFLogxK

U2 - 10.3724/SP.J.1089.2019.17505

DO - 10.3724/SP.J.1089.2019.17505

M3 - Article

AN - SCOPUS:85073154567

SN - 1003-9775

VL - 31

SP - 1122

EP - 1129

JO - Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics

JF - Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics

IS - 7

ER -
