TY - JOUR
T1 - Self-Supervised Cross-Modal Distillation for Thermal Infrared Tracking
AU - Zha, Yufei
AU - Sun, Jingxian
AU - Zhang, Peng
AU - Zhang, Lichao
AU - Gonzalez-Garcia, Abel
AU - Huang, Wei
N1 - Publisher Copyright:
© 1994-2012 IEEE.
PY - 2022/10/1
Y1 - 2022/10/1
N2 - Target representations play an important role in improving the performance of Thermal Infrared (TIR) tracking. To this end, we propose a Cross-Modal Distillation method that distills representations for the TIR modality from the RGB modality, trained on a large amount of unlabeled paired RGB-TIR data in a self-supervised way. Benefiting from the powerful model in the RGB modality, the cross-modal distillation learns TIR-specific representations that promote TIR tracking. The proposed approach can be conveniently incorporated into different baseline trackers as a generic and independent component. In practice, three different approaches are explored to generate paired RGB-TIR patches with the same semantics for self-supervised training, making it easy to extend to an even larger scale of unlabeled training data. Our tracker outperforms the baseline tracker by absolute gains of 2.3% in Success Rate, 2.7% in Precision, and 2.5% in Normalized Precision on published datasets.
KW - Cross-Modal Distillation
KW - Self-Supervised
KW - Thermal Infrared Tracking
UR - http://www.scopus.com/inward/record.url?scp=85139391340&partnerID=8YFLogxK
U2 - 10.1109/MMUL.2022.3207239
DO - 10.1109/MMUL.2022.3207239
M3 - Article
AN - SCOPUS:85139391340
SN - 1070-986X
VL - 29
SP - 80
EP - 96
JO - IEEE Multimedia
JF - IEEE Multimedia
IS - 4
ER -