Efficient thermal infrared tracking with cross-modal compress distillation

Hangfei Li, Yufei Zha, Huanyu Li, Peng Zhang, Wei Huang

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

The key issue in thermal infrared tracking is using neural networks to represent the target effectively and efficiently in the thermal infrared domain. The scarcity of thermal infrared training datasets makes it difficult to train a robust infrared object tracker from scratch, and time-consuming convolution operations slow tracking down. To address these problems, we propose cross-modal compression distillation, which represents thermal infrared objects for tracking by leveraging an off-the-shelf RGB model through knowledge distillation. Specifically, cross-modal distillation transfers knowledge from the RGB modality to the thermal infrared modality by feeding paired RGB and thermal infrared images into the two branches of a Siamese network. Additionally, within the teacher–student architecture, the feature extractor is compressed into a lightweight model by model pruning and multi-level deep feature matching. Experimental results on the LSOTB-TIR and PTB-TIR datasets show that thermal infrared trackers distilled by the proposed method track faster and perform better than the baseline RGB tracker, with gains of 1.5% in Success Rate, 2.2% in Precision, and 1.9% in Normalized Precision, along with 58 frames per second (FPS), on the LSOTB-TIR dataset.
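The multi-level deep feature matching mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss formula, so everything here is an assumption made for illustration: the per-level distance (mean squared error), the per-level weights, the function names `mse` and `multi_level_matching_loss`, and the simplification that teacher (RGB) and student (thermal infrared) features at each level are flattened to vectors of equal length. A pruned student would likely need an adapter layer to align channel counts before such a comparison.

```python
from typing import Optional, Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two flattened feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    if not a:
        raise ValueError("feature vectors must be non-empty")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def multi_level_matching_loss(
    teacher_feats: Sequence[Sequence[float]],
    student_feats: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted sum of per-level MSE between teacher and student features.

    teacher_feats[i] is assumed to hold the RGB teacher's features at level i,
    and student_feats[i] the thermal infrared student's features at the same
    level. The weighting scheme is hypothetical, not taken from the paper.
    """
    if len(teacher_feats) != len(student_feats):
        raise ValueError("teacher and student must expose the same number of levels")
    if weights is None:
        weights = [1.0] * len(teacher_feats)  # assumption: equal weight per level
    if len(weights) != len(teacher_feats):
        raise ValueError("one weight per feature level is required")
    return sum(
        w * mse(t, s) for w, t, s in zip(weights, teacher_feats, student_feats)
    )


# Toy features for two levels (made-up numbers, not real network activations).
teacher = [[1.0, 2.0], [0.0, 0.0, 0.0]]
student = [[1.0, 4.0], [3.0, 0.0, 0.0]]
loss = multi_level_matching_loss(teacher, student)
```

In a training loop, a term of this kind would be minimized with respect to the student's parameters only, with the teacher kept frozen, so that the student's thermal infrared features move toward the teacher's RGB features at every matched level.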

Original language: English
Article number: 106360
Journal: Engineering Applications of Artificial Intelligence
Volume: 123
State: Published - Aug 2023

Keywords

  • Cross-modal
  • Knowledge distillation
  • Thermal infrared tracking
