Self-Supervised Cross-Modal Distillation for Thermal Infrared Tracking

Yufei Zha, Jingxian Sun, Peng Zhang, Lichao Zhang, Abel Gonzalez-Garcia, Wei Huang

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Target representations play an important role in improving the performance of thermal infrared (TIR) tracking. To learn better representations, we propose a cross-modal distillation method that distills representations for the TIR modality from the RGB modality and is trained on a large amount of unlabeled paired RGB-TIR data in a self-supervised way. Benefiting from the powerful model in the RGB modality, cross-modal distillation learns a TIR-specific representation that promotes TIR tracking. The proposed approach can be conveniently incorporated into different baseline trackers as a generic and independent component. In practice, three different approaches are explored to generate paired RGB-TIR patches with the same semantics for self-supervised training. The method is easy to extend to an even larger scale of unlabeled training data. Our tracker outperforms the baseline tracker with absolute gains of 2.3% in success rate, 2.7% in precision, and 2.5% in normalized precision on published datasets.
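The core idea in the abstract, a TIR student learning to match a frozen RGB teacher on unlabeled paired patches, can be illustrated with a toy sketch. This is not the authors' implementation: the linear feature extractors, the mean-squared-error loss, the synthetic patch pairs, and every name below (`linear`, `distill_step`, `make_pairs`, `train`) are assumptions chosen only to show the shape of the training loop.

```python
import random

def linear(weights, x):
    """Apply a linear feature extractor: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def distillation_loss(student_feat, teacher_feat):
    """Mean squared error between student (TIR) and teacher (RGB) features."""
    n = len(teacher_feat)
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / n

def distill_step(student_w, teacher_w, rgb_patch, tir_patch, lr):
    """One gradient step on the student only; the teacher stays frozen."""
    teacher_feat = linear(teacher_w, rgb_patch)
    student_feat = linear(student_w, tir_patch)
    n = len(teacher_feat)
    for i, row in enumerate(student_w):
        grad_scale = 2.0 * (student_feat[i] - teacher_feat[i]) / n
        for j in range(len(row)):
            row[j] -= lr * grad_scale * tir_patch[j]
    # Loss is reported for the features computed before the update.
    return distillation_loss(student_feat, teacher_feat)

def make_pairs(num_pairs, dim, rng):
    """Hypothetical toy data: each TIR patch is a scaled, noisy RGB patch."""
    pairs = []
    for _ in range(num_pairs):
        rgb = [rng.uniform(-1, 1) for _ in range(dim)]
        tir = [0.5 * v + rng.gauss(0, 0.01) for v in rgb]
        pairs.append((rgb, tir))
    return pairs

def train(num_epochs=50, dim=4, feat_dim=3, lr=0.1, seed=0):
    """Train the student on unlabeled pairs; return the mean loss per epoch."""
    rng = random.Random(seed)
    # Stand-in for a pretrained RGB model (assumption: random fixed weights).
    teacher_w = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(feat_dim)]
    student_w = [[0.0] * dim for _ in range(feat_dim)]
    pairs = make_pairs(32, dim, rng)
    history = []
    for _ in range(num_epochs):
        total = sum(distill_step(student_w, teacher_w, rgb, tir, lr)
                    for rgb, tir in pairs)
        history.append(total / len(pairs))
    return history

losses = train()
```

No tracking labels appear anywhere in the loop: the only supervision is the teacher's output on the paired RGB patch, which is what makes the setup self-supervised in the sense the abstract describes.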

Original language: English
Pages (from-to): 80-96
Number of pages: 17
Journal: IEEE Multimedia
Volume: 29
Issue number: 4
DOIs
State: Published - 1 Oct 2022

Keywords

  • Cross-Modal Distillation
  • Self-Supervised
  • Thermal Infrared Tracking
