Segmentation in Weakly Labeled Videos via a Semantic Ranking and Optical Warping Network

Le Yang, Junwei Han, Dingwen Zhang, Nian Liu, Dong Zhang

Research output: Contribution to journal › Article › peer-review

20 Scopus citations

Abstract

Weakly supervised video object segmentation (WSVOS) aims to generate pixel-level object masks for videos tagged only with class labels, an essential yet challenging task. In WSVOS, the algorithm knows only rough category information, not the concrete object size and location cues; it also lacks reliable annotated exemplars from which to learn temporal evolution in the investigated videos. Three challenging factors may influence the performance of WSVOS: foreground object discovery in each frame, coarse object semantic consistency within each video, and fine-grained segmentation smoothness across neighboring frames. In this paper, we establish a semantic ranking and optical warping network that addresses these three challenges simultaneously in a unified framework. For the first challenge, we apply a still-image saliency detection method and discover the foreground object in each frame via a segmentation network. Because of the large discrepancies between image saliency and video object segmentation, we go a step further and propose two subnetworks to address the other two challenges. For the second, we propose an attentive semantic ranking subnetwork to mine video-level tags, which learns discriminative features for semantic ranking and leads to semantically consistent segmentation masks. For the third, we propose an optical flow warping subnetwork to constrain fine-grained segmentation smoothness across neighboring frames, which suppresses large deformations and thus yields smooth object boundaries for adjacent frames. Experiments on two benchmark data sets, DAVIS and YouTube-Objects, demonstrate the effectiveness of the proposed approach for segmenting video objects under weak supervision.
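
The idea of constraining segmentation smoothness between neighboring frames with optical flow can be illustrated with a small sketch. This is not the paper's subnetwork: it is a plain NumPy illustration under stated assumptions. The function names `warp_mask` and `warping_consistency`, the backward-flow convention, nearest-neighbor sampling, and the mean absolute difference penalty are all hypothetical choices made for the example.

```python
import numpy as np


def warp_mask(mask, flow):
    """Backward-warp a (H, W) mask with a (H, W, 2) flow field.

    Assumed convention: flow[y, x] = (dx, dy) means the pixel at (x, y)
    in the current frame came from (x + dx, y + dy) in the previous frame.
    Sampling is nearest-neighbor; out-of-range coordinates are clamped.
    """
    h, w = mask.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return mask[src_y, src_x]


def warping_consistency(prev_mask, curr_mask, flow):
    """Mean absolute difference between the warped previous mask and the
    current mask. A low value means the two masks agree along the flow."""
    warped = warp_mask(prev_mask, flow)
    return float(np.mean(np.abs(warped - curr_mask)))


# Toy example: a 2x2 object that moves one pixel to the right.
prev_mask = np.zeros((4, 4), dtype=float)
prev_mask[1:3, 1:3] = 1.0
curr_mask = np.zeros((4, 4), dtype=float)
curr_mask[1:3, 2:4] = 1.0

# Backward flow: every current pixel came from one pixel to its left.
flow = np.zeros((4, 4, 2), dtype=float)
flow[..., 0] = -1.0

aligned = warping_consistency(prev_mask, curr_mask, flow)
unaligned = warping_consistency(prev_mask, curr_mask, np.zeros_like(flow))
print(aligned, unaligned)
```

In this toy case the penalty drops to zero once the previous mask is warped along the correct flow, and is nonzero when the motion is ignored. A trainable version would typically use differentiable bilinear sampling instead of nearest-neighbor indexing.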

Original language: English
Pages (from-to): 4025-4037
Number of pages: 13
Journal: IEEE Transactions on Image Processing
Volume: 27
Issue number: 8
State: Published - Aug 2018

Keywords

  • optical warping
  • semantic ranking
  • video object segmentation
  • weak supervision
