Compensating for the Incomplete With the Complete: An Efficient Scene Text Detector

  • Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

Abstract

Scene text reading is an essential component of scene understanding, and text detection, its fundamental prerequisite, has garnered increasing attention. Among the various methods, segmenting the text kernel and extending it to reconstruct text instances is both efficient and effective. However, the incomplete semantic features of text kernels and the high similarity between kernels and texts make it hard to extract kernels from images accurately. To address this, we propose an efficient text detector, termed CIC, which comprises a bidirectional information transfer module (BITM), a dual knowledge integration module (DKIM), and a cross-verification module (CVM). BITM generates collaborative information between the predicted text and kernel via the proposed differentiable adaptive gap operator. It forces mutual restraint and collaborative progress between the text and kernel predictions. Unlike BITM, DKIM adopts a knowledge fusion scheme that helps locate kernels accurately under the guidance of the complete semantic features of texts. Intuitively, because the kernel is generated by shrinking the text, kernel pixels can appear only inside the text area. Based on this criterion, CVM further uses text predictions to constrain kernel predictions and reduce false positives. Ablation experiments demonstrate the effectiveness of the proposed BITM, DKIM, and CVM. Extensive experiments show that the proposed CIC outperforms existing state-of-the-art (SOTA) methods on five public datasets from different scenes.
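
The criterion behind CVM (a kernel pixel can lie only inside the text area) can be illustrated with a small sketch. This is not the authors' module: the function name `cross_verify`, the threshold `text_thresh`, and the hard gating rule are all assumptions chosen for illustration, and the abstract does not say how CVM applies the constraint in practice.

```python
# Illustrative sketch only. The function name, the threshold, and the hard
# gating rule are assumptions, not the CVM described in the paper.

def cross_verify(text_prob, kernel_prob, text_thresh=0.5):
    """Zero out kernel scores at pixels the text map does not mark as text."""
    return [
        [k if t >= text_thresh else 0.0 for t, k in zip(t_row, k_row)]
        for t_row, k_row in zip(text_prob, kernel_prob)
    ]


# Toy 2x4 probability maps: columns 1 and 2 are predicted as text.
text_prob = [
    [0.1, 0.9, 0.9, 0.1],
    [0.1, 0.9, 0.9, 0.1],
]
kernel_prob = [
    [0.8, 0.7, 0.2, 0.0],  # 0.8 at column 0 falls outside the text region
    [0.0, 0.9, 0.6, 0.7],  # 0.7 at column 3 falls outside the text region
]

verified = cross_verify(text_prob, kernel_prob)
print(verified)
```

In this toy case the two kernel responses outside the predicted text region are suppressed, which is the kind of false positive the criterion rules out, while kernel scores inside the text region pass through unchanged.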

Original language: English
Pages (from-to): 12096-12108
Number of pages: 13
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 35
Issue number: 12
DOI
Publication status: Published - 2025
