Pull Pole Points to Text Contour by Magnetism: A Real-Time Scene Text Detector

Research output: Contribution to journal › Article › peer-review

Abstract

Scene text reading plays a crucial role in scene understanding. As a prerequisite task, scene text detection has garnered increasing interest from researchers. Segmentation-based text detection methods have gained prominence due to their adaptable pixel-level predictions. Many existing methods predict a shrink mask and use the Vatti clipping algorithm to reconstruct text contours. However, the shrink mask focuses only on global geometric features and shrinks the same distance everywhere, which neglects local contour information and disrupts the shape features of the instance. In addition, post-processing based on the Vatti clipping algorithm relies heavily on the predictions and is relatively complex, leading to suboptimal detection accuracy and efficiency. To address these problems, we propose an efficient and effective method named Magnetic Text Detector (MTD), inspired by magnetism. It consists of a text representation method, the flexible mask (FM), and a magnetic pull module (MPM). Unlike the shrink mask and the concentric mask, FM accounts for local contours and shrinks by varying distances at different positions, which avoids the truncation issue while preserving distinctiveness from the text regions. MPM generates magnetic fields and pulls the pole points of FM to the text contour by magnetism. This allows accurate reconstruction of text contours, even when predictions deviate severely from the actual text, while saving approximately 50% of the post-processing time. Several ablation studies verify the effectiveness of the proposed FM and MPM. Extensive experiments show that MTD achieves state-of-the-art (SOTA) performance on multiple datasets from different scenes.
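
To make the "shrinks the same distance everywhere" limitation concrete, the following is a minimal sketch of a uniform shrink offset. It is not the MTD flexible mask. It assumes the offset rule commonly used by shrink-mask detectors, D = A(1 - r^2) / L, where A is the polygon area, L its perimeter, and r a shrink ratio; the abstract does not state this rule, and the function names, the ratio of 0.4, and the example box are all hypothetical. An axis-aligned rectangle stands in for a text instance so that no polygon clipping library is needed.

```python
import math


def polygon_area(points):
    """Shoelace formula; points is a list of (x, y) vertices in order."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of edge lengths of the closed polygon."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def uniform_shrink_distance(points, ratio=0.4):
    """Assumed offset rule (not taken from the abstract): D = A * (1 - r^2) / L."""
    return polygon_area(points) * (1.0 - ratio ** 2) / polygon_perimeter(points)


def shrink_axis_aligned_box(box, distance):
    """Move every side of (x_min, y_min, x_max, y_max) inward by the same distance."""
    x_min, y_min, x_max, y_max = box
    return (x_min + distance, y_min + distance, x_max - distance, y_max - distance)


# Hypothetical 100 x 20 text box.
text_box = (0.0, 0.0, 100.0, 20.0)
corners = [(0.0, 0.0), (100.0, 0.0), (100.0, 20.0), (0.0, 20.0)]

d = uniform_shrink_distance(corners, ratio=0.4)
shrunk = shrink_axis_aligned_box(text_box, d)
print(d, shrunk)
```

For this box the offset is 7 units, so the height drops from 20 to 6 while the width drops only from 100 to 86. A single global offset is applied regardless of local shape, which is the behavior the abstract contrasts with FM's position-dependent shrinking.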

Original language: English
Pages (from-to): 6374-6385
Number of pages: 12
Journal: IEEE Transactions on Image Processing
Volume: 34
State: Published - 2025

Keywords

  • Real-time
  • magnetic
  • multi-scene
  • text detection
