ViT Spatio-Temporal Feature Fusion for Aerial Object Tracking

Chuangye Guo, Kang Liu, Donghu Deng, Xuelong Li

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Object tracking in aerial remote sensing images has developed significantly, but it remains a challenging task. Typical difficulties include the accumulation of long-term tracking errors, interference from similar objects, partial or full occlusion, and scale change, any of which can cause tracking failure. In this paper, an aerial object tracker with ViT Spatio-Temporal Feature Fusion (STFF) is proposed for aerial remote sensing images, which achieves accurate tracking of aerial objects. First, we propose a spatio-temporal feature fusion strategy based on the temporal characteristics of object tracking. In this strategy, object information from previous frames is used to enhance both the real-time responsiveness of the model and the performance of the tracker. Second, the dynamic change information of objects in the spatial and temporal context is used for spatio-temporal feature fusion, which further strengthens the relevant correlations and promotes feature aggregation and information propagation in visual tracking. Finally, a dataset containing both real and virtual scenarios is collected and constructed to meet the training-data requirements of aerial object tracking. Our experiments show that STFF achieves accurate tracking of aerial objects, with excellent performance on UAV123, DTB70, and our benchmark.
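To make the fusion idea concrete, below is a minimal sketch (not the authors' code) of one plausible reading of ViT spatio-temporal feature fusion for tracking: ViT tokens from the initial template and a few previous-frame templates are aggregated into the current search-region tokens via cross-attention. The module name, tensor shapes, and hyperparameters are all illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of spatio-temporal feature fusion for ViT tracking.
# Assumption: template/search features are already extracted as ViT tokens.
import torch
import torch.nn as nn


class SpatioTemporalFusion(nn.Module):
    """Fuse temporal template tokens into current search tokens (illustrative)."""

    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        # Cross-attention: search tokens query the temporal template tokens.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)
        self.norm_out = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, search_tokens: torch.Tensor,
                template_tokens: list) -> torch.Tensor:
        # search_tokens: (B, N_s, dim) tokens of the current search region.
        # template_tokens: list of (B, N_t, dim) tokens from the initial
        # template and a few previous frames (the temporal context).
        temporal = torch.cat(template_tokens, dim=1)   # (B, T*N_t, dim)
        q = self.norm_q(search_tokens)
        kv = self.norm_kv(temporal)
        fused, _ = self.cross_attn(q, kv, kv)          # aggregate temporal cues
        x = search_tokens + fused                      # residual fusion
        return x + self.mlp(self.norm_out(x))          # token-wise refinement


if __name__ == "__main__":
    fusion = SpatioTemporalFusion(dim=256)
    search = torch.randn(2, 324, 256)                  # e.g. 18x18 search tokens
    templates = [torch.randn(2, 64, 256) for _ in range(3)]  # init + 2 prev frames
    print(fusion(search, templates).shape)             # torch.Size([2, 324, 256])
```

Keeping a small rolling set of previous-frame templates, as sketched here, is one common way to inject temporal context without reprocessing the full video history each frame.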

Original language: English
Pages (from-to): 6749-6761
Number of pages: 13
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 34
Issue number: 8
DOIs
State: Published - 2024

Keywords

  • Aerial object tracking
  • self-built dataset
  • spatio-temporal feature fusion
