A holistic representation guided attention network for scene text recognition

Lu Yang, Peng Wang, Hui Li, Zhen Li, Yanning Zhang

Research output: Contribution to journal › Article › peer-review

59 Citations (Scopus)

Abstract

Reading irregular scene text of arbitrary shape in natural images is still a challenging problem, despite the progress made recently. Many existing approaches incorporate sophisticated network structures to handle various shapes, use extra annotations for stronger supervision, or employ hard-to-train recurrent neural networks for sequence modeling. In this work, we propose a simple yet strong approach for scene text recognition. With no need to convert input images to sequence representations, we directly connect two-dimensional CNN features to an attention-based sequence decoder that is guided by a holistic representation. The holistic representation can guide the attention-based decoder to focus on more accurate areas. As no recurrent module is adopted, our model can be trained in parallel. It achieves a 1.5× to 9.4× speed-up in the backward pass and a 1.3× to 7.9× speed-up in the forward pass, compared with its RNN counterparts. The proposed model is trained with only word-level annotations. With this simple design, our method achieves state-of-the-art or competitive recognition performance on the evaluated regular and irregular scene text benchmark datasets.
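The abstract only sketches the architecture. The snippet below is a minimal, hypothetical PyTorch illustration of the general idea: a pooled holistic representation conditions the attention queries of a parallel, non-recurrent decoder that attends directly over two-dimensional CNN features. Class, parameter, and dimension names (HolisticGuidedAttention, d_feat, max_len, etc.) are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch, not the authors' code: a non-recurrent attention decoder
# whose queries are conditioned on a holistic (globally pooled) image representation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HolisticGuidedAttention(nn.Module):
    def __init__(self, d_feat=512, d_model=256, vocab_size=97, max_len=25):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # holistic (global) representation
        self.holistic_proj = nn.Linear(d_feat, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)  # decoding positions, no recurrence
        self.key_proj = nn.Conv2d(d_feat, d_model, 1)
        self.val_proj = nn.Conv2d(d_feat, d_model, 1)
        self.classifier = nn.Linear(d_model, vocab_size)
        self.max_len = max_len

    def forward(self, feat_2d):
        # feat_2d: (B, d_feat, H, W) 2D CNN features, never flattened into a 1D sequence
        b = feat_2d.size(0)
        holistic = self.holistic_proj(self.pool(feat_2d).flatten(1))          # (B, d_model)
        # queries for all character positions at once, each biased by the holistic vector
        queries = self.pos_emb.weight[: self.max_len].unsqueeze(0) + holistic.unsqueeze(1)
        keys = self.key_proj(feat_2d).flatten(2).transpose(1, 2)              # (B, HW, d_model)
        vals = self.val_proj(feat_2d).flatten(2).transpose(1, 2)              # (B, HW, d_model)
        attn = F.softmax(queries @ keys.transpose(1, 2) / keys.size(-1) ** 0.5, dim=-1)
        glimpses = attn @ vals                                                # (B, max_len, d_model)
        return self.classifier(glimpses)                                      # (B, max_len, vocab)
```

Because every decoding position is computed in one batched attention operation rather than step by step, both the forward and backward passes parallelize naturally, which is consistent with the speed-ups over RNN decoders reported in the abstract.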

Original language: English
Pages (from-to): 67-75
Number of pages: 9
Journal: Neurocomputing
Volume: 414
DOI
Publication status: Published - 13 Nov 2020
