Abstract
The detection of text instances with arbitrary styles remains a major source of errors in scene text understanding. Variation in language, orientation and scale greatly reduces text salience. While recent studies have proposed various learning frameworks to tackle this issue, many rely on complex post-processing and struggle to handle extreme scale variations. To this end, we explore a unified coarse-to-fine framework via multi-scale cross-knowledge learning for arbitrary-shape text detection. Unlike previous methods that model feature point correlations in a holistic manner, our approach adaptively selects a small set of key sampling points around each reference point. This not only reduces computational overhead across varying scales but also mitigates interference from background noise. Moreover, incorporating multi-scale information under semantic priors further strengthens the reliability of dependency modelling. Extensive experiments on widely used benchmarks demonstrate that our method, guided by cross-knowledge and adaptive attention, achieves competitive performance. Specifically, it attains F-measure scores of 92.8% on MSRA-TD500, 87.1% on MSRA-TD500, and 90.5% on Total-Text.
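The contrast between holistic correlation modelling and attending to a few sampled points can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`bilinear_sample`, `sparse_point_attention`) are hypothetical, the feature map is a single-channel grid, and the offsets and scores are passed in by hand, whereas in a learned model they would presumably be predicted from the query feature.

```python
import math

def bilinear_sample(fmap, x, y):
    """Sample a 2D grid (list of rows) at fractional (x, y).
    Neighbours that fall outside the grid contribute 0."""
    h, w = len(fmap), len(fmap[0])
    x0, y0 = math.floor(x), math.floor(y)
    value = 0.0
    for yi, wy in ((y0, 1 - (y - y0)), (y0 + 1, y - y0)):
        for xi, wx in ((x0, 1 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= xi < w and 0 <= yi < h:
                value += wx * wy * fmap[yi][xi]
    return value

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_point_attention(fmap, ref, offsets, scores):
    """Aggregate features from K points around `ref` instead of all H*W points.
    `offsets` (K pairs of dx, dy) and `scores` (K floats) are assumed inputs here."""
    weights = softmax(scores)
    rx, ry = ref
    return sum(
        w * bilinear_sample(fmap, rx + dx, ry + dy)
        for w, (dx, dy) in zip(weights, offsets)
    )
```

With K sampled points per reference, the cost of one aggregation scales with K rather than with the number of positions in the feature map, and background positions that are never sampled cannot contribute to the output.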
| Original language | English |
|---|---|
| Article number | 112744 |
| Journal | Engineering Applications of Artificial Intelligence |
| Volume | 164 |
| State | Published - 15 Jan 2026 |
Keywords
- Arbitrary-shape text detection
- Boundary optimisation
- Coarse-to-fine learning
- Multi-scale cross-knowledge
Fingerprint
Dive into the research topics of 'Progressive boundary optimisation with cross-knowledge enhancement for arbitrary-shape text detection'. Together they form a unique fingerprint.