Target-Aware Transformer for Satellite Video Object Tracking

Pujian Lai, Meili Zhang, Gong Cheng, Shengyang Li, Xiankai Huang, Junwei Han

Research output: Contribution to journal › Article › peer-review

20 Scopus citations

Abstract

Recent years have witnessed remarkable progress of the transformer-based paradigm in single object tracking (SOT) for generic videos. However, because the targets of interest in satellite videos are small and weak in visual appearance, the progress of the transformer-based paradigm in satellite video object tracking has been impeded. To alleviate this issue, a novel transformer-based approach is proposed, which consists of a bi-direction propagation and fusion (Bi-PF) strategy and a target-aware enhancement (TAE) module. Concretely, we first adopt the Bi-PF strategy to make full use of multiscale information and generate discriminative representations of tracking targets. Then, the TAE module decouples an object query into a content-aware embedding and a spatial-aware embedding, and produces a target prototype that helps obtain a high-quality content-aware embedding. Unlike previous satellite video tracking methods, most of which are evaluated on only a few videos, we conduct extensive experiments on the SatSOT dataset, which consists of 105 videos. In particular, the proposed method achieves a success score of 45.6% and a precision score of 57.6%, surpassing the baseline method by 5.0% and 9.5%, respectively. The code will be released at https://github.com/laybebe/TATrans_SVOT.
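For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how a success score and a precision score are commonly computed in single object tracking. It is not taken from the paper or its repository. The function names, the `(x, y, w, h)` box format, the 21 overlap thresholds, and the 5-pixel precision threshold are all illustrative assumptions, and the exact protocol used for SatSOT may differ.

```python
from math import hypot

# Boxes are assumed to be (x, y, w, h) with (x, y) the top-left corner.

def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def center_error(a, b):
    """Euclidean distance in pixels between the two box centers."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return hypot((ax + aw / 2) - (bx + bw / 2), (ay + ah / 2) - (by + bh / 2))

def success_score(preds, gts, n_thresholds=21):
    """Area under the success curve: for each overlap threshold in [0, 1],
    take the fraction of frames whose IoU exceeds it, then average."""
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / (n_thresholds - 1) for i in range(n_thresholds)]
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)

def precision_score(preds, gts, pixel_threshold=5.0):
    """Fraction of frames whose center error is within the pixel threshold.
    The default of 5 pixels is an assumption chosen for small targets."""
    errors = [center_error(p, g) for p, g in zip(preds, gts)]
    return sum(e <= pixel_threshold for e in errors) / len(errors)
```

Under these assumptions, each function takes one predicted box and one ground-truth box per frame and returns a value in [0, 1]; benchmark-level figures such as those quoted above are typically aggregated over all videos in the dataset.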

Original language: English
Article number: 5601410
Pages (from-to): 1-10
Number of pages: 10
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 62
State: Published - 2024

Keywords

  • Bi-direction propagation and fusion (Bi-PF)
  • satellite video object tracking
  • target-aware enhancement (TAE)
