Text kernel calculation for arbitrary shape text detection

Xu Han; Junyu Gao; Yuan Yuan; Qi Wang

doi:10.1007/s00371-023-02963-2

Institute of Optoelectronics and Intelligence

Northwestern Polytechnical University, Xi'an

Research output: Contribution to journal › Article › Peer-reviewed

2 citations (Scopus)

Abstract

With the speedy progress of deep learning, text detection has received progressively increasing attention and considerable progress. The current mainstream approaches are usually based on instance segmentation to obtain the label of whether the pixel is text, as this can cope with arbitrary-shaped text. However, pixel-based prediction usually leads to overlapping neighboring texts, resulting in misdetection. To mitigate the above problems, we propose an approach to calculate text kernels and determine the attribution of boundary pixels. This way, all texts are labeled uniformly, facilitating model learning and effectively separating adherent texts. In addition, to cope with the complex and variable background of the text, we propose a practical feature enhancement module to handle it. The proposed module can explore different levels of features to represent text information of diverse sizes. Compared with current advanced algorithms, our method is competitive, which achieves the F1-measure of 87.3, 88.0, 82.8, 85.7, and 90.0% on the ICDAR2015, MSRA-TD500, CTW1500, Total-Text, and ICDAR2013 datasets, respectively.

Original language	English
Pages (from-to)	2641-2654
Number of pages	14
Journal	Visual Computer
Volume	40
Issue	4
DOI	https://doi.org/10.1007/s00371-023-02963-2
Publication status	Published - Apr 2024

Cite this

@article{b17b59618c59446d8754e4a5253ec671,

title = "Text kernel calculation for arbitrary shape text detection",

abstract = "With the speedy progress of deep learning, text detection has received progressively increasing attention and considerable progress. The current mainstream approaches are usually based on instance segmentation to obtain the label of whether the pixel is text, as this can cope with arbitrary-shaped text. However, pixel-based prediction usually leads to overlapping neighboring texts, resulting in misdetection. To mitigate the above problems, we propose an approach to calculate text kernels and determine the attribution of boundary pixels. This way, all texts are labeled uniformly, facilitating model learning and effectively separating adherent texts. In addition, to cope with the complex and variable background of the text, we propose a practical feature enhancement module to handle it. The proposed module can explore different levels of features to represent text information of diverse sizes. Compared with current advanced algorithms, our method is competitive, which achieves the F1-measure of 87.3, 88.0, 82.8, 85.7, and 90.0% on the ICDAR2015, MSRA-TD500, CTW1500, Total-Text, and ICDAR2013 datasets, respectively.",

keywords = "Arbitrary-shaped text, Instance segmentation, Text detection, Text kernel calculation",

author = "Xu Han and Junyu Gao and Yuan Yuan and Qi Wang",

note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023.",

year = "2024",

month = apr,

doi = "10.1007/s00371-023-02963-2",

language = "English",

volume = "40",

pages = "2641--2654",

journal = "Visual Computer",

issn = "0178-2789",

publisher = "Springer Verlag",

number = "4",

}

TY - JOUR

T1 - Text kernel calculation for arbitrary shape text detection

AU - Han, Xu

AU - Gao, Junyu

AU - Yuan, Yuan

AU - Wang, Qi

N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023.

PY - 2024/4

Y1 - 2024/4

N2 - With the speedy progress of deep learning, text detection has received progressively increasing attention and considerable progress. The current mainstream approaches are usually based on instance segmentation to obtain the label of whether the pixel is text, as this can cope with arbitrary-shaped text. However, pixel-based prediction usually leads to overlapping neighboring texts, resulting in misdetection. To mitigate the above problems, we propose an approach to calculate text kernels and determine the attribution of boundary pixels. This way, all texts are labeled uniformly, facilitating model learning and effectively separating adherent texts. In addition, to cope with the complex and variable background of the text, we propose a practical feature enhancement module to handle it. The proposed module can explore different levels of features to represent text information of diverse sizes. Compared with current advanced algorithms, our method is competitive, which achieves the F1-measure of 87.3, 88.0, 82.8, 85.7, and 90.0% on the ICDAR2015, MSRA-TD500, CTW1500, Total-Text, and ICDAR2013 datasets, respectively.

AB - With the speedy progress of deep learning, text detection has received progressively increasing attention and considerable progress. The current mainstream approaches are usually based on instance segmentation to obtain the label of whether the pixel is text, as this can cope with arbitrary-shaped text. However, pixel-based prediction usually leads to overlapping neighboring texts, resulting in misdetection. To mitigate the above problems, we propose an approach to calculate text kernels and determine the attribution of boundary pixels. This way, all texts are labeled uniformly, facilitating model learning and effectively separating adherent texts. In addition, to cope with the complex and variable background of the text, we propose a practical feature enhancement module to handle it. The proposed module can explore different levels of features to represent text information of diverse sizes. Compared with current advanced algorithms, our method is competitive, which achieves the F1-measure of 87.3, 88.0, 82.8, 85.7, and 90.0% on the ICDAR2015, MSRA-TD500, CTW1500, Total-Text, and ICDAR2013 datasets, respectively.

KW - Arbitrary-shaped text

KW - Instance segmentation

KW - Text detection

KW - Text kernel calculation

UR - http://www.scopus.com/inward/record.url?scp=85163742170&partnerID=8YFLogxK

U2 - 10.1007/s00371-023-02963-2

DO - 10.1007/s00371-023-02963-2

M3 - Article

AN - SCOPUS:85163742170

SN - 0178-2789

VL - 40

SP - 2641

EP - 2654

JO - Visual Computer

JF - Visual Computer

IS - 4

ER -
