Text kernel calculation for arbitrary shape text detection

Xu Han; Junyu Gao; Yuan Yuan; Qi Wang

doi:10.1007/s00371-023-02963-2

School of Artificial Intelligence, OPtics and Electronics

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

With the speedy progress of deep learning, text detection has received progressively increasing attention and considerable progress. The current mainstream approaches are usually based on instance segmentation to obtain the label of whether the pixel is text, as this can cope with arbitrary-shaped text. However, pixel-based prediction usually leads to overlapping neighboring texts, resulting in misdetection. To mitigate the above problems, we propose an approach to calculate text kernels and determine the attribution of boundary pixels. This way, all texts are labeled uniformly, facilitating model learning and effectively separating adherent texts. In addition, to cope with the complex and variable background of the text, we propose a practical feature enhancement module to handle it. The proposed module can explore different levels of features to represent text information of diverse sizes. Compared with current advanced algorithms, our method is competitive, which achieves the F1-measure of 87.3, 88.0, 82.8, 85.7, and 90.0% on the ICDAR2015, MSRA-TD500, CTW1500, Total-Text, and ICDAR2013 datasets, respectively.

Original language	English
Pages (from-to)	2641-2654
Number of pages	14
Journal	Visual Computer
Volume	40
Issue number	4
DOIs	https://doi.org/10.1007/s00371-023-02963-2
State	Published - Apr 2024

Keywords

Arbitrary-shaped text
Instance segmentation
Text detection
Text kernel calculation

Access to Document

10.1007/s00371-023-02963-2

Cite this

@article{b17b59618c59446d8754e4a5253ec671,

title = "Text kernel calculation for arbitrary shape text detection",

abstract = "With the speedy progress of deep learning, text detection has received progressively increasing attention and considerable progress. The current mainstream approaches are usually based on instance segmentation to obtain the label of whether the pixel is text, as this can cope with arbitrary-shaped text. However, pixel-based prediction usually leads to overlapping neighboring texts, resulting in misdetection. To mitigate the above problems, we propose an approach to calculate text kernels and determine the attribution of boundary pixels. This way, all texts are labeled uniformly, facilitating model learning and effectively separating adherent texts. In addition, to cope with the complex and variable background of the text, we propose a practical feature enhancement module to handle it. The proposed module can explore different levels of features to represent text information of diverse sizes. Compared with current advanced algorithms, our method is competitive, which achieves the F1-measure of 87.3, 88.0, 82.8, 85.7, and 90.0% on the ICDAR2015, MSRA-TD500, CTW1500, Total-Text, and ICDAR2013 datasets, respectively.",

keywords = "Arbitrary-shaped text, Instance segmentation, Text detection, Text kernel calculation",

author = "Xu Han and Junyu Gao and Yuan Yuan and Qi Wang",

note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023.",

year = "2024",

month = apr,

doi = "10.1007/s00371-023-02963-2",

language = "English",

volume = "40",

pages = "2641--2654",

journal = "Visual Computer",

issn = "0178-2789",

publisher = "Springer Verlag",

number = "4",

}

TY - JOUR

T1 - Text kernel calculation for arbitrary shape text detection

AU - Han, Xu

AU - Gao, Junyu

AU - Yuan, Yuan

AU - Wang, Qi

N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023.

PY - 2024/4

Y1 - 2024/4

N2 - With the speedy progress of deep learning, text detection has received progressively increasing attention and considerable progress. The current mainstream approaches are usually based on instance segmentation to obtain the label of whether the pixel is text, as this can cope with arbitrary-shaped text. However, pixel-based prediction usually leads to overlapping neighboring texts, resulting in misdetection. To mitigate the above problems, we propose an approach to calculate text kernels and determine the attribution of boundary pixels. This way, all texts are labeled uniformly, facilitating model learning and effectively separating adherent texts. In addition, to cope with the complex and variable background of the text, we propose a practical feature enhancement module to handle it. The proposed module can explore different levels of features to represent text information of diverse sizes. Compared with current advanced algorithms, our method is competitive, which achieves the F1-measure of 87.3, 88.0, 82.8, 85.7, and 90.0% on the ICDAR2015, MSRA-TD500, CTW1500, Total-Text, and ICDAR2013 datasets, respectively.

AB - With the speedy progress of deep learning, text detection has received progressively increasing attention and considerable progress. The current mainstream approaches are usually based on instance segmentation to obtain the label of whether the pixel is text, as this can cope with arbitrary-shaped text. However, pixel-based prediction usually leads to overlapping neighboring texts, resulting in misdetection. To mitigate the above problems, we propose an approach to calculate text kernels and determine the attribution of boundary pixels. This way, all texts are labeled uniformly, facilitating model learning and effectively separating adherent texts. In addition, to cope with the complex and variable background of the text, we propose a practical feature enhancement module to handle it. The proposed module can explore different levels of features to represent text information of diverse sizes. Compared with current advanced algorithms, our method is competitive, which achieves the F1-measure of 87.3, 88.0, 82.8, 85.7, and 90.0% on the ICDAR2015, MSRA-TD500, CTW1500, Total-Text, and ICDAR2013 datasets, respectively.

KW - Arbitrary-shaped text

KW - Instance segmentation

KW - Text detection

KW - Text kernel calculation

UR - http://www.scopus.com/inward/record.url?scp=85163742170&partnerID=8YFLogxK

U2 - 10.1007/s00371-023-02963-2

DO - 10.1007/s00371-023-02963-2

M3 - Article

AN - SCOPUS:85163742170

SN - 0178-2789

VL - 40

SP - 2641

EP - 2654

JO - Visual Computer

JF - Visual Computer

IS - 4

ER -