TY - GEN
T1 - Deformable object tracking with spatiotemporal segmentation in big vision surveillance
AU - Zhuo, Tao
AU - Zhang, Peng
AU - Zhang, Yanning
AU - Huang, Wei
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2014/12/11
Y1 - 2014/12/11
N2 - The rapid development of worldwide networks has shifted many challenging problems in vision-based surveillance from the video level to the big-video level. An important technique for big video processing is to extract salient information from the video data effectively. As a fundamental function for data analysis such as behavior understanding for social security, object tracking usually plays an essential role by separating the salient areas from the background scenarios in video. However, object tracking in realistic environments is not easy because the appearance of a realistic object may deform continually during its movement. In conventional online tracking-by-learning studies, fixed-shape appearance modeling is usually adopted for training sample generation because of its simplicity and convenience. Unfortunately, for generic deformable objects, this modeling approach may wrongly discriminate some background areas as part of the object, which deteriorates the model update during online learning. To resolve this problem, employing object segmentation to obtain more precise foreground areas for learning sample generation has been proposed recently, but a common limitation of these approaches is that the segmentation is performed only in the spatial domain rather than the spatiotemporal domain of the video. Consequently, when the background texture is similar to the target object, accurate segmentation is hard to achieve and tracking failure occurs. In this paper, a motion-appearance model for deformable object segmentation is proposed by incorporating pixel-based gradient flow in the spatiotemporal domain. With motion information between consecutive frames, the irregularly shaped object can be accurately segmented by energy function optimization and boundary convergence, and the proposed segmentation is then incorporated into a structural SVM tracking framework for online learning sample generation. We have evaluated the proposed tracker on benchmark videos as well as surveillance video datasets including heavy intrinsic variations and occlusions; the experimental results verify a significant improvement in tracking accuracy and robustness in comparison with other state-of-the-art trackers.
AB - The rapid development of worldwide networks has shifted many challenging problems in vision-based surveillance from the video level to the big-video level. An important technique for big video processing is to extract salient information from the video data effectively. As a fundamental function for data analysis such as behavior understanding for social security, object tracking usually plays an essential role by separating the salient areas from the background scenarios in video. However, object tracking in realistic environments is not easy because the appearance of a realistic object may deform continually during its movement. In conventional online tracking-by-learning studies, fixed-shape appearance modeling is usually adopted for training sample generation because of its simplicity and convenience. Unfortunately, for generic deformable objects, this modeling approach may wrongly discriminate some background areas as part of the object, which deteriorates the model update during online learning. To resolve this problem, employing object segmentation to obtain more precise foreground areas for learning sample generation has been proposed recently, but a common limitation of these approaches is that the segmentation is performed only in the spatial domain rather than the spatiotemporal domain of the video. Consequently, when the background texture is similar to the target object, accurate segmentation is hard to achieve and tracking failure occurs. In this paper, a motion-appearance model for deformable object segmentation is proposed by incorporating pixel-based gradient flow in the spatiotemporal domain. With motion information between consecutive frames, the irregularly shaped object can be accurately segmented by energy function optimization and boundary convergence, and the proposed segmentation is then incorporated into a structural SVM tracking framework for online learning sample generation. We have evaluated the proposed tracker on benchmark videos as well as surveillance video datasets including heavy intrinsic variations and occlusions; the experimental results verify a significant improvement in tracking accuracy and robustness in comparison with other state-of-the-art trackers.
UR - http://www.scopus.com/inward/record.url?scp=84920725926&partnerID=8YFLogxK
U2 - 10.1109/SPAC.2014.6982647
DO - 10.1109/SPAC.2014.6982647
M3 - Conference contribution
AN - SCOPUS:84920725926
T3 - Proceedings 2014 IEEE International Conference on Security, Pattern Analysis, and Cybernetics, SPAC 2014
SP - 1
EP - 6
BT - Proceedings 2014 IEEE International Conference on Security, Pattern Analysis, and Cybernetics, SPAC 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 IEEE International Conference on Security, Pattern Analysis, and Cybernetics, SPAC 2014
Y2 - 18 October 2014 through 19 October 2014
ER -