Optimal Kernel for Real-Time Arbitrary-Shaped Text Detection

Haozhao Ma; Chuang Yang; Yuan Yuan; Qi Wang

doi:10.1109/ICASSP49357.2023.10094734

School of Artificial Intelligence, OPtics and Electronics

Northwestern Polytechnical University Xian

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Segmentation-based text detection methods have developed rapidly in recent years and achieve competitive accuracy and detection speed. However, these methods struggle to fit text instances accurately, which degrades model performance. In addition, boundary pixels perceive the text center poorly, which further reduces detection accuracy. To address these issues, we design an efficient framework for arbitrary-shaped text detection built on an Optimal Kernel Representation (OKR) and a Pixel Enhancement Module (PEM). OKR fits texts with optimal kernels: it erodes each text according to its geometric characteristics, which is simpler and more accurate than previous methods. PEM strengthens the perception that boundary pixels have of the virtual character centers of the text, improving the cohesion of the whole instance. PEM participates only in training, so it adds no computation cost at inference. Ablation experiments show the effectiveness of OKR and PEM. Comparisons on several benchmarks verify that our efficient detector is superior to existing state-of-the-art (SOTA) methods.
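For context on what eroding a text region into a kernel involves, here is a minimal sketch of the fixed-ratio shrink offset used by earlier segmentation-based detectors such as PSENet and DBNet, d = A(1 - r^2) / L, where A is the polygon area, L its perimeter, and r a shrink ratio. This is an illustration of that earlier baseline only, not the OKR procedure, which the abstract does not specify. The function names and the example box are hypothetical.

```python
import math


def polygon_area(points):
    """Absolute area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of a closed polygon."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def shrink_offset(points, ratio):
    """Inward offset d = A * (1 - r^2) / L of the fixed-ratio baseline.

    Assumption: this is the conventional rule from earlier detectors,
    not the geometry-aware erosion that OKR proposes.
    """
    perimeter = polygon_perimeter(points)
    if perimeter == 0:
        raise ValueError("polygon has zero perimeter")
    return polygon_area(points) * (1.0 - ratio ** 2) / perimeter


# Hypothetical 100 x 20 axis-aligned text box, shrink ratio 0.4.
box = [(0, 0), (100, 0), (100, 20), (0, 20)]
offset = shrink_offset(box, ratio=0.4)
```

The offset depends only on area, perimeter, and one global ratio, so long thin texts and compact texts are eroded by the same rule. The abstract's claim is that eroding according to each text's geometric characteristics fits instances more accurately.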

Original language	English
Title of host publication	ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9781728163277
DOIs	https://doi.org/10.1109/ICASSP49357.2023.10094734
State	Published - 2023
Event	48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 - Rhodes Island, Greece; Duration: 4 Jun 2023 → 10 Jun 2023

Publication series

Name	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume	2023-June
ISSN (Print)	1520-6149

Conference

Conference	48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Country/Territory	Greece
City	Rhodes Island
Period	4/06/23 → 10/06/23

Keywords

Efficient text detector
optimal kernel
pixel enhancement

Cite this

Ma, H., Yang, C., Yuan, Y., & Wang, Q. (2023). Optimal Kernel for Real-Time Arbitrary-Shaped Text Detection. In ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2023-June). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP49357.2023.10094734

@inproceedings{7b3f78f4dd1c4e259e27cfa0b5fccf73,
title = "Optimal Kernel for Real-Time Arbitrary-Shaped Text Detection",
abstract = "Segmentation-based text detection methods have developed rapidly in recent years and achieve competitive accuracy and detection speed. However, these methods struggle to fit text instances accurately, which degrades model performance. In addition, boundary pixels perceive the text center poorly, which further reduces detection accuracy. To address these issues, we design an efficient framework for arbitrary-shaped text detection built on an Optimal Kernel Representation (OKR) and a Pixel Enhancement Module (PEM). OKR fits texts with optimal kernels: it erodes each text according to its geometric characteristics, which is simpler and more accurate than previous methods. PEM strengthens the perception that boundary pixels have of the virtual character centers of the text, improving the cohesion of the whole instance. PEM participates only in training, so it adds no computation cost at inference. Ablation experiments show the effectiveness of OKR and PEM. Comparisons on several benchmarks verify that our efficient detector is superior to existing state-of-the-art (SOTA) methods.",
keywords = "Efficient text detector, optimal kernel, pixel enhancement",
author = "Haozhao Ma and Chuang Yang and Yuan Yuan and Qi Wang",
note = "Publisher Copyright: {\textcopyright} 2023 IEEE.; 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 ; Conference date: 04-06-2023 Through 10-06-2023",
year = "2023",
doi = "10.1109/ICASSP49357.2023.10094734",
language = "English",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
booktitle = "ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings",
}

TY - GEN
T1 - Optimal Kernel for Real-Time Arbitrary-Shaped Text Detection
AU - Ma, Haozhao
AU - Yang, Chuang
AU - Yuan, Yuan
AU - Wang, Qi
PY - 2023
Y1 - 2023
AB - Segmentation-based text detection methods have developed rapidly in recent years and achieve competitive accuracy and detection speed. However, these methods struggle to fit text instances accurately, which degrades model performance. In addition, boundary pixels perceive the text center poorly, which further reduces detection accuracy. To address these issues, we design an efficient framework for arbitrary-shaped text detection built on an Optimal Kernel Representation (OKR) and a Pixel Enhancement Module (PEM). OKR fits texts with optimal kernels: it erodes each text according to its geometric characteristics, which is simpler and more accurate than previous methods. PEM strengthens the perception that boundary pixels have of the virtual character centers of the text, improving the cohesion of the whole instance. PEM participates only in training, so it adds no computation cost at inference. Ablation experiments show the effectiveness of OKR and PEM. Comparisons on several benchmarks verify that our efficient detector is superior to existing state-of-the-art (SOTA) methods.
KW - Efficient text detector
KW - optimal kernel
KW - pixel enhancement
UR - http://www.scopus.com/inward/record.url?scp=85177572741&partnerID=8YFLogxK
U2 - 10.1109/ICASSP49357.2023.10094734
DO - 10.1109/ICASSP49357.2023.10094734
M3 - Conference contribution
AN - SCOPUS:85177572741
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
BT - ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Y2 - 4 June 2023 through 10 June 2023
ER -
