TY - JOUR
T1 - Learning Adaptive Target-and-Surrounding Soft Mask for Correlation Filter Based Visual Tracking
AU - Zhang, Ke
AU - Wang, Wuwei
AU - Wang, Jingyu
AU - Wang, Qi
AU - Li, Xuelong
N1 - Publisher Copyright: © 1991-2012 IEEE.
PY - 2022/6/1
Y1 - 2022/6/1
N2 - Visual tracking is a critical problem in computer vision and video processing. For Discriminative Correlation Filter (DCF)-based tracking methods, it is essential to adaptively incorporate reliable target and surrounding information from video frames. However, most existing DCF-based trackers rely solely on pre-defined, fixed constraints, such as a binary mask or quadratic function-based regularization, to improve discrimination. Such approaches cannot adjust the constraints as tracking circumstances change over the video sequence, which makes the learned filters less reliable. To mitigate these problems, we present a novel DCF-based tracking method that introduces an adaptive target-and-surrounding soft mask (ATSM) into the learning formulation. The adaptive soft mask, represented by floating-point values, carries detailed information about both the target region and its surroundings: first, for the background area, it introduces meaningful background information and suppresses uninformative background information; second, for the target area inside the bounding box, it helps the filter focus on the reliable area and suppress the rapidly changing area; third, the soft mask is adaptively adjusted according to variations of the target and its surroundings during tracking. By jointly modeling the filter and the adaptive soft mask, our ATSM tracker efficiently integrates meaningful foreground and background information and performs favorably against state-of-the-art algorithms on seven well-known benchmarks.
AB - Visual tracking is a critical problem in computer vision and video processing. For Discriminative Correlation Filter (DCF)-based tracking methods, it is essential to adaptively incorporate reliable target and surrounding information from video frames. However, most existing DCF-based trackers rely solely on pre-defined, fixed constraints, such as a binary mask or quadratic function-based regularization, to improve discrimination. Such approaches cannot adjust the constraints as tracking circumstances change over the video sequence, which makes the learned filters less reliable. To mitigate these problems, we present a novel DCF-based tracking method that introduces an adaptive target-and-surrounding soft mask (ATSM) into the learning formulation. The adaptive soft mask, represented by floating-point values, carries detailed information about both the target region and its surroundings: first, for the background area, it introduces meaningful background information and suppresses uninformative background information; second, for the target area inside the bounding box, it helps the filter focus on the reliable area and suppress the rapidly changing area; third, the soft mask is adaptively adjusted according to variations of the target and its surroundings during tracking. By jointly modeling the filter and the adaptive soft mask, our ATSM tracker efficiently integrates meaningful foreground and background information and performs favorably against state-of-the-art algorithms on seven well-known benchmarks.
KW - discriminative correlation filters
KW - soft mask
KW - target and surrounding
KW - video processing
KW - visual tracking
UR - http://www.scopus.com/inward/record.url?scp=85114643392&partnerID=8YFLogxK
U2 - 10.1109/TCSVT.2021.3108176
DO - 10.1109/TCSVT.2021.3108176
M3 - Article
AN - SCOPUS:85114643392
SN - 1051-8215
VL - 32
SP - 3708
EP - 3721
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 6
ER -