TY - JOUR
T1 - ViT Spatio-Temporal Feature Fusion for Aerial Object Tracking
AU - Guo, Chuangye
AU - Liu, Kang
AU - Deng, Donghu
AU - Li, Xuelong
N1 - Publisher Copyright: © 1991-2012 IEEE.
PY - 2024
Y1 - 2024
AB - Object tracking in aerial remote sensing images has advanced significantly, but it remains a challenging task. Difficulties such as the accumulation of long-term tracking errors, interference from similar objects, partial or full occlusion, and scale change can cause tracking to fail. This paper proposes an aerial object tracker with ViT Spatio-Temporal Feature Fusion (STFF) for aerial remote sensing images, which achieves accurate tracking of aerial objects. First, we propose a spatio-temporal feature fusion strategy based on the temporal characteristics of object tracking. In this strategy, object information from previous frames is used to enhance both the real-time responsiveness of the model and the performance of the tracker. Second, the dynamic changes of objects in their spatial and temporal context are used for spatio-temporal feature fusion, which further strengthens the relevant correlations and promotes feature aggregation and information transmission in visual tracking. Finally, a dataset with real and virtual scenarios is collected and constructed to address the training data requirements of aerial object tracking. In our experiments, STFF tracks aerial objects accurately and achieves excellent performance on UAV123, DTB70, and our own benchmarks.
KW - Aerial object tracking
KW - self-built dataset
KW - spatio-temporal feature fusion
UR - http://www.scopus.com/inward/record.url?scp=85176304574&partnerID=8YFLogxK
U2 - 10.1109/TCSVT.2023.3326695
DO - 10.1109/TCSVT.2023.3326695
M3 - Article
AN - SCOPUS:85176304574
SN - 1051-8215
VL - 34
SP - 6749
EP - 6761
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 8
ER -