Multi-grained Temporal Prototype Learning for Few-shot Video Object Segmentation

Nian Liu, Kepan Nan, Wangbo Zhao, Yuanwei Liu, Xiwen Yao, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Junwei Han, Fahad Shahbaz Khan

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

12 Citations (Scopus)

Abstract

Few-Shot Video Object Segmentation (FSVOS) aims to segment objects in a query video with the same category defined by a few annotated support images. However, this task was seldom explored. In this work, based on IPMT, a state-of-the-art few-shot image segmentation method that combines external support guidance information with adaptive query guidance cues, we propose to leverage multi-grained temporal guidance information for handling the temporal correlation nature of video data. We decompose the query video information into a clip prototype and a memory prototype for capturing local and long-term internal temporal guidance, respectively. Frame prototypes are further used for each frame independently to handle fine-grained adaptive guidance and enable bidirectional clip-frame prototype communication. To reduce the influence of noisy memory, we propose to leverage the structural similarity relation among different predicted regions and the support for selecting reliable memory frames. Furthermore, a new segmentation loss is also proposed to enhance the category discriminability of the learned prototypes. Experimental results demonstrate that our proposed video IPMT model significantly outperforms previous models on two benchmark datasets. Code is available at https://github.com/nankepan/VIPMT.

Original language: English
Title of host publication: Proceedings - 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 18816-18825
Number of pages: 10
ISBN (electronic): 9798350307184
DOI
Publication status: Published - 2023
Event: 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023 - Paris, France
Duration: 2 Oct 2023 to 6 Oct 2023

Publication series

Name: Proceedings of the IEEE International Conference on Computer Vision
ISSN (print): 1550-5499

Conference

Conference: 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023
Country/Territory: France
City: Paris
Period: 2/10/23 to 6/10/23
