Abstract
In visual object tracking, the performance gap between lightweight and heavyweight trackers is a significant challenge, particularly for applications on edge devices. We introduce the Cross-Alignment Tracker (CAT), a dual-branch framework designed to narrow this gap by integrating lightweight and heavyweight tracking components. During training, CAT processes inputs through two parallel branches, one lightweight and one heavyweight, each consisting of a backbone network and a tracking head. The two branches are trained simultaneously and their outputs are adapted and aligned, so that the accuracy and robustness of the heavyweight tracker guide the learning of its lightweight counterpart. During online tracking, CAT cross-combines components of the two branches to produce four distinct one-stream trackers, each with a different computational complexity and parameter count. Empirical evaluations, notably on the GOT-10k benchmark, show that CAT-Tiny surpasses existing real-time trackers by 4.8%, approaching the performance of larger, high-performance models. A single training session thus yields four model sizes, each suited to different tracking demands, demonstrating the efficiency and scalability of the method for real-time object tracking.
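The cross-combination idea described in the abstract can be illustrated with a minimal sketch. It assumes that the four trackers arise from pairing each of the two backbones with each of the two heads. The abstract does not specify the form of the alignment objective, so the mean-squared difference used here is an assumption, and all names and toy parameter counts are hypothetical rather than taken from the paper.

```python
from itertools import product

# Hypothetical stand-ins for the two branches. The names and the toy
# "size" numbers are illustrative only and do not come from the paper.
BACKBONES = {"light": 5, "heavy": 80}
HEADS = {"light": 1, "heavy": 10}


def cross_combinations(backbones, heads):
    """Pair every backbone with every head, giving one tracker per pair."""
    trackers = []
    for (b_name, b_size), (h_name, h_size) in product(backbones.items(), heads.items()):
        trackers.append({"backbone": b_name, "head": h_name, "size": b_size + h_size})
    # Order the resulting trackers from smallest to largest.
    return sorted(trackers, key=lambda t: t["size"])


def alignment_loss(light_out, heavy_out):
    """Mean squared difference between branch outputs (assumed loss form)."""
    if len(light_out) != len(heavy_out):
        raise ValueError("branch outputs must have the same length")
    return sum((l - h) ** 2 for l, h in zip(light_out, heavy_out)) / len(light_out)


trackers = cross_combinations(BACKBONES, HEADS)
for t in trackers:
    print(t["backbone"], "backbone +", t["head"], "head, toy size", t["size"])
```

In this sketch, two backbones and two heads give four backbone-head pairs, which matches the count of trackers reported in the abstract. In a real model the mixed pairs would only work if the backbone outputs and head inputs were compatible, which is presumably what the adaptation and alignment step provides.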
| Original language | English |
|---|---|
| Article number | 112048 |
| Journal | Pattern Recognition |
| Volume | 170 |
| State | Published - Feb 2026 |
Keywords
- Cross-learning
- Feature distillation
- Lightweight tracking
- Predicted distillation
- Transformer
Fingerprint
Dive into the research topics of 'Cross-alignment for efficient visual object tracking'. Together they form a unique fingerprint.