Joint graph learning and video segmentation via multiple cues and topology calibration

Jingkuan Song; Lianli Gao; Mihai Marian Puscas; Feiping Nie; Fumin Shen; Nicu Sebe

doi:10.1145/2964284.2964295

Institute of Optoelectronics and Intelligence

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

19 Citations (Scopus)

Abstract

Video segmentation has become an important and active research area with a large diversity of proposed approaches. Graph-based methods, which achieve top performance on recent benchmarks, usually focus on either obtaining a precise similarity graph or designing efficient graph-cutting strategies. However, these two components are often carried out in two separate steps, so the obtained similarity graph may not be optimal for segmentation, which can lead to suboptimal results. In this paper, we propose a novel framework, joint graph learning and video segmentation (JGLVS), which learns the similarity graph and the video segmentation simultaneously. JGLVS learns the similarity graph by assigning adaptive neighbors to each vertex based on multiple cues (appearance, motion, boundary and spatial information). Meanwhile, a new rank constraint is imposed on the Laplacian matrix of the similarity graph, such that the number of connected components in the resulting similarity graph exactly equals the number of segments. Furthermore, JGLVS can automatically weight the multiple cues and calibrate the pairwise distances of superpixels based on their topological structure. Most notably, empirical results on the challenging VSB100 benchmark show that JGLVS outperforms the state of the art by up to 11% on the BPR metric.
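The rank constraint mentioned in the abstract rests on a standard spectral fact: for a graph with n vertices and non-negative edge weights, the multiplicity of the zero eigenvalue of its Laplacian equals the number of connected components c, so rank(L) = n - c. The following is a minimal sketch of that property only, not the authors' optimization procedure; the toy similarity matrix and the function names `graph_laplacian` and `count_components` are hypothetical and chosen for illustration.

```python
import numpy as np


def graph_laplacian(S):
    """Unnormalized Laplacian L = D - W of the symmetrized similarity matrix."""
    W = (S + S.T) / 2.0
    return np.diag(W.sum(axis=1)) - W


def count_components(S, tol=1e-9):
    """Count connected components as the number of (near-)zero Laplacian eigenvalues."""
    eigvals = np.linalg.eigvalsh(graph_laplacian(S))
    return int(np.sum(eigvals < tol))


# Toy similarity graph over five "superpixels" (illustrative values only):
# vertices {0, 1, 2} form one group and {3, 4} another, with no edges between them.
S = np.array([
    [0.0, 0.9, 0.4, 0.0, 0.0],
    [0.9, 0.0, 0.7, 0.0, 0.0],
    [0.4, 0.7, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.8],
    [0.0, 0.0, 0.0, 0.8, 0.0],
])

n = S.shape[0]
c = count_components(S)
rank = np.linalg.matrix_rank(graph_laplacian(S))
print(f"components = {c}, rank(L) = {rank}, n - c = {n - c}")
```

Under this reading, requiring rank(L) = n - c during graph learning forces the learned graph to split into exactly c components, so the segmentation can be read directly off the graph without a separate cutting step.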

Original language	English
Title of host publication	MM 2016 - Proceedings of the 2016 ACM Multimedia Conference
Publisher	Association for Computing Machinery, Inc
Pages	831-840
Number of pages	10
ISBN (Electronic)	9781450336031
DOI	https://doi.org/10.1145/2964284.2964295
Publication status	Published - 1 Oct 2016
Event	24th ACM Multimedia Conference, MM 2016 - Amsterdam, Netherlands; Duration: 15 Oct 2016 → 19 Oct 2016

Publication series

Name	MM 2016 - Proceedings of the 2016 ACM Multimedia Conference

Conference

Conference	24th ACM Multimedia Conference, MM 2016
Country/Territory	Netherlands
City	Amsterdam
Period	15/10/16 → 19/10/16

Cite this

Song, J., Gao, L., Puscas, M. M., Nie, F., Shen, F., & Sebe, N. (2016). Joint graph learning and video segmentation via multiple cues and topology calibration. In MM 2016 - Proceedings of the 2016 ACM Multimedia Conference (pp. 831-840). (MM 2016 - Proceedings of the 2016 ACM Multimedia Conference). Association for Computing Machinery, Inc. https://doi.org/10.1145/2964284.2964295

@inproceedings{8732f87012af40ee805a532fa793c9c6,

title = "Joint graph learning and video segmentation via multiple cues and topology calibration",

abstract = "Video segmentation has become an important and active research area with a large diversity of proposed approaches. Graph-based methods, enabling top performance on recent benchmarks, usually focus on either obtaining a precise similarity graph or designing efficient graph cutting strategies. However, these two components are often conducted in two separated steps, and thus the obtained similarity graph may not be the optimal one for segmentation and this may lead to suboptimal results. In this paper, we propose a novel framework, joint graph learning and video segmentation (JGLVS), which learns the similarity graph and video segmentation simultaneously. JGLVS learns the similarity graph by assigning adaptive neighbors for each vertex based on multiple cues (appearance, motion, boundary and spatial information). Meanwhile, the new rank constraint is imposed to the Laplacian matrix of the similarity graph, such that the connected components in the resulted similarity graph are exactly equal to the number of segmentations. Furthermore, JGLVS can automatically weigh multiple cues and calibrate the pairwise distance of superpixels based on their topology structures. Most noticeably, empirical results on the challenging dataset VSB100 show that JGLVS achieves promising performance on the benchmark dataset which outperforms the state-of-the-art by up to 11% for the BPR metric.",

keywords = "Graph-based method, Multiple cues, Topology, Video segmentation",

author = "Jingkuan Song and Lianli Gao and Puscas, {Mihai Marian} and Feiping Nie and Fumin Shen and Nicu Sebe",

note = "Publisher Copyright: {\textcopyright} 2016 ACM.; 24th ACM Multimedia Conference, MM 2016 ; Conference date: 15-10-2016 Through 19-10-2016",

year = "2016",

month = oct,

day = "1",

doi = "10.1145/2964284.2964295",

language = "English",

series = "MM 2016 - Proceedings of the 2016 ACM Multimedia Conference",

publisher = "Association for Computing Machinery, Inc",

pages = "831--840",

booktitle = "MM 2016 - Proceedings of the 2016 ACM Multimedia Conference",

}

Song, J, Gao, L, Puscas, MM, Nie, F, Shen, F & Sebe, N 2016, Joint graph learning and video segmentation via multiple cues and topology calibration. in MM 2016 - Proceedings of the 2016 ACM Multimedia Conference. MM 2016 - Proceedings of the 2016 ACM Multimedia Conference, Association for Computing Machinery, Inc, pp. 831-840, 24th ACM Multimedia Conference, MM 2016, Amsterdam, Netherlands, 15/10/16. https://doi.org/10.1145/2964284.2964295

Joint graph learning and video segmentation via multiple cues and topology calibration. / Song, Jingkuan; Gao, Lianli; Puscas, Mihai Marian et al.
MM 2016 - Proceedings of the 2016 ACM Multimedia Conference. Association for Computing Machinery, Inc, 2016. p. 831-840 (MM 2016 - Proceedings of the 2016 ACM Multimedia Conference).

TY - GEN

T1 - Joint graph learning and video segmentation via multiple cues and topology calibration

AU - Song, Jingkuan

AU - Gao, Lianli

AU - Puscas, Mihai Marian

AU - Nie, Feiping

AU - Shen, Fumin

AU - Sebe, Nicu

PY - 2016/10/1

Y1 - 2016/10/1

N2 - Video segmentation has become an important and active research area with a large diversity of proposed approaches. Graph-based methods, enabling top performance on recent benchmarks, usually focus on either obtaining a precise similarity graph or designing efficient graph cutting strategies. However, these two components are often conducted in two separated steps, and thus the obtained similarity graph may not be the optimal one for segmentation and this may lead to suboptimal results. In this paper, we propose a novel framework, joint graph learning and video segmentation (JGLVS), which learns the similarity graph and video segmentation simultaneously. JGLVS learns the similarity graph by assigning adaptive neighbors for each vertex based on multiple cues (appearance, motion, boundary and spatial information). Meanwhile, the new rank constraint is imposed to the Laplacian matrix of the similarity graph, such that the connected components in the resulted similarity graph are exactly equal to the number of segmentations. Furthermore, JGLVS can automatically weigh multiple cues and calibrate the pairwise distance of superpixels based on their topology structures. Most noticeably, empirical results on the challenging dataset VSB100 show that JGLVS achieves promising performance on the benchmark dataset which outperforms the state-of-the-art by up to 11% for the BPR metric.

AB - Video segmentation has become an important and active research area with a large diversity of proposed approaches. Graph-based methods, enabling top performance on recent benchmarks, usually focus on either obtaining a precise similarity graph or designing efficient graph cutting strategies. However, these two components are often conducted in two separated steps, and thus the obtained similarity graph may not be the optimal one for segmentation and this may lead to suboptimal results. In this paper, we propose a novel framework, joint graph learning and video segmentation (JGLVS), which learns the similarity graph and video segmentation simultaneously. JGLVS learns the similarity graph by assigning adaptive neighbors for each vertex based on multiple cues (appearance, motion, boundary and spatial information). Meanwhile, the new rank constraint is imposed to the Laplacian matrix of the similarity graph, such that the connected components in the resulted similarity graph are exactly equal to the number of segmentations. Furthermore, JGLVS can automatically weigh multiple cues and calibrate the pairwise distance of superpixels based on their topology structures. Most noticeably, empirical results on the challenging dataset VSB100 show that JGLVS achieves promising performance on the benchmark dataset which outperforms the state-of-the-art by up to 11% for the BPR metric.

KW - Graph-based method

KW - Multiple cues

KW - Topology

KW - Video segmentation

UR - http://www.scopus.com/inward/record.url?scp=84994613586&partnerID=8YFLogxK

U2 - 10.1145/2964284.2964295

DO - 10.1145/2964284.2964295

M3 - Conference contribution

AN - SCOPUS:84994613586

T3 - MM 2016 - Proceedings of the 2016 ACM Multimedia Conference

SP - 831

EP - 840

BT - MM 2016 - Proceedings of the 2016 ACM Multimedia Conference

PB - Association for Computing Machinery, Inc

T2 - 24th ACM Multimedia Conference, MM 2016

Y2 - 15 October 2016 through 19 October 2016

ER -
