Vision-Inspired Transformer-Based Thermal Infrared Target Tracking Framework for Internet of Things

  • Shaoyang Ma
  • Kai Zhang
  • Yao Yang
  • Qiyan Liu
  • Gang Chen

  • Xi'an Jiaotong University
  • Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

Abstract

Thermal Infrared (TIR) tracking is pivotal for Internet of Things (IoT) applications, providing robust perception in adverse weather and low-light conditions. However, the lack of color and texture in infrared imagery severely limits target discrimination, undermining tracking stability in complex real-world environments. To address these challenges, we propose a vision-inspired transformer-based TIR tracking framework that draws on the core principles of the human visual system. Specifically, a dynamic adaptive appearance context network, inspired by the ventral pathway, models long-term appearance variations, while a trajectory encoding module, reflecting the dorsal pathway, captures motion dynamics. These complementary streams are fused via a spatiotemporal consistency module that emulates the integrative function of the prefrontal cortex. An adaptive memory update strategy further preserves historical information under occlusion and interference. Experiments on the LSOTB-TIR and PTB-TIR benchmarks demonstrate state-of-the-art performance, with success rates of 72.8% and 73.3%, respectively, exceeding the baseline by 6.3% and 8.0%. Moreover, the tracking speed of 51 FPS underscores that the proposed method combines superior accuracy with real-time efficiency, paving the way for more resilient and intelligent IoT systems.

Original language: English
Journal: IEEE Internet of Things Journal
State: Accepted/In press - 2025

Keywords

  • Internet of Things
  • Temporal context
  • Thermal Infrared
  • Vision Transformer
  • Vision-Inspired
