TY - GEN
T1 - SPFTN
T2 - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017
AU - Zhang, Dingwen
AU - Yang, Le
AU - Meng, Deyu
AU - Xu, Dong
AU - Han, Junwei
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/11/6
Y1 - 2017/11/6
N2 - Object segmentation in weakly labelled videos is an interesting yet challenging task, which aims at learning to perform category-specific video object segmentation by only using video-level tags. Existing works in this research area might still have some limitations, e.g., lack of effective DNN-based learning frameworks, under-exploring the context information, and requiring to leverage the unstable negative video collection, which prevent them from obtaining more promising performance. To this end, we propose a novel self-paced fine-tuning network (SPFTN)-based framework, which could learn to explore the context information within the video frames and capture adequate object semantics without using the negative videos. To perform weakly supervised learning based on the deep neural network, we make the earliest effort to integrate the self-paced learning regime and the deep neural network into a unified and compatible framework, leading to the self-paced fine-tuning network. Comprehensive experiments on the large-scale YouTube-Objects and DAVIS datasets demonstrate that the proposed approach achieves superior performance as compared with other state-of-the-art methods as well as the baseline networks and models.
AB - Object segmentation in weakly labelled videos is an interesting yet challenging task, which aims at learning to perform category-specific video object segmentation by only using video-level tags. Existing works in this research area might still have some limitations, e.g., lack of effective DNN-based learning frameworks, under-exploring the context information, and requiring to leverage the unstable negative video collection, which prevent them from obtaining more promising performance. To this end, we propose a novel self-paced fine-tuning network (SPFTN)-based framework, which could learn to explore the context information within the video frames and capture adequate object semantics without using the negative videos. To perform weakly supervised learning based on the deep neural network, we make the earliest effort to integrate the self-paced learning regime and the deep neural network into a unified and compatible framework, leading to the self-paced fine-tuning network. Comprehensive experiments on the large-scale YouTube-Objects and DAVIS datasets demonstrate that the proposed approach achieves superior performance as compared with other state-of-the-art methods as well as the baseline networks and models.
UR - http://www.scopus.com/inward/record.url?scp=85040645052&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2017.567
DO - 10.1109/CVPR.2017.567
M3 - 会议稿件
AN - SCOPUS:85040645052
T3 - Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017
SP - 5340
EP - 5348
BT - Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 21 July 2017 through 26 July 2017
ER -