Joint Learning Spatial-Temporal Attention Correlation Filters for Aerial Tracking

Bo Zhao, Sugang Ma, Zhixian Zhao, Lei Zhang, Zhiqiang Hou

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Discriminative correlation filter (DCF)-based UAV tracking algorithms have drawn much attention due to their outstanding robustness and high computational efficiency. However, these algorithms are easily disturbed by background noise and abrupt changes in target appearance, leading to tracking failure. To address these issues, we propose a real-time UAV object tracking algorithm with adaptive spatial-temporal attention. Specifically, we construct two filters with different roles based on the target foreground and the environmental background of the training sample. The spatial attention filter is implemented by incorporating a spatial context regularizer into the traditional DCF paradigm; it makes full use of background information to suppress background noise and to distinguish effectively between the target and the background. The temporal attention filter focuses on the continuity of the target samples: it models only the target patch samples during training and introduces a temporal context regularizer, which substantially enhances the tracker's robustness against target occlusions and deformations. The two filters are jointly optimized with the Alternating Direction Method of Multipliers (ADMM), so that they constrain each other during training and complement each other during detection. Extensive experiments on three mainstream UAV benchmarks demonstrate the tracking advantages of the proposed algorithm.
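For readers unfamiliar with the DCF paradigm the abstract builds on, the following is a minimal single-channel sketch of a correlation filter with a temporal regularizer. It is not the authors' method: the abstract does not give the objective, so the loss assumed here is the generic ridge-regression form with an added penalty that pulls the new filter towards the previous one, which has a closed-form solution in the Fourier domain and therefore needs no ADMM. The spatial context regularizer and the dual-filter design are omitted. All function names and the parameters `lam` and `mu` are hypothetical.

```python
import numpy as np


def gaussian_label(shape, sigma=2.0):
    """Desired response: a Gaussian whose peak sits at index (0, 0)."""
    h, w = shape
    ys = np.arange(h) - h // 2
    xs = np.arange(w) - w // 2
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    g = np.exp(-(yy ** 2 + xx ** 2) / (2.0 * sigma ** 2))
    # Move the peak from the patch centre to the origin.
    return np.fft.ifftshift(g)


def train_filter(patch, label, prev_filter_f=None, lam=1e-2, mu=0.0):
    """Solve, per frequency bin, the assumed objective
        |X*H - Y|^2 + lam*|H|^2 + mu*|H - H_prev|^2
    and return the filter H in the Fourier domain.
    """
    X = np.fft.fft2(patch)
    Y = np.fft.fft2(label)
    num = np.conj(X) * Y
    den = np.abs(X) ** 2 + lam
    if prev_filter_f is not None:
        # Temporal term: bias the solution towards the previous filter.
        num = num + mu * prev_filter_f
        den = den + mu
    return num / den


def detect(filter_f, patch):
    """Return the (row, col) of the response peak, i.e. the estimated shift."""
    response = np.real(np.fft.ifft2(np.fft.fft2(patch) * filter_f))
    return np.unravel_index(np.argmax(response), response.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal((32, 32))
    label = gaussian_label(frame.shape)
    h1 = train_filter(frame, label)
    # A circularly shifted copy of the training patch stands in for target motion.
    moved = np.roll(frame, shift=(3, 5), axis=(0, 1))
    print(detect(h1, moved))
```

In this sketch a larger `mu` keeps the filter closer to its previous state, which is the general intuition behind temporal regularization: a frame corrupted by occlusion changes the model less. The regularizers and the joint optimization in the paper itself may differ substantially from this simplified form.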

Original language: English
Pages (from-to): 686-690
Number of pages: 5
Journal: IEEE Signal Processing Letters
Volume: 31
State: Published - 2024

Keywords

  • Discriminative correlation filter
  • dual regularization
  • spatial context regularization
  • temporal context regularization
  • unmanned aerial vehicle
