User-Ranking Video Summarization with Multi-Stage Spatio-Temporal Representation

Siyu Huang, Xi Li, Zhongfei Zhang, Fei Wu, Junwei Han

Research output: Contribution to journal › Article › peer-review

49 Citations (Scopus)

Abstract

Video summarization is a challenging task, mainly due to the difficulties in learning complicated semantic structural relations between videos and summaries. In this paper, we present a novel supervised video summarization scheme based on three-stage deep neural networks. The scheme takes a divide-and-conquer strategy to resolve the complicated task of 3D video summarization into a set of easy and flexible computational subtasks, and then sequentially applies 2D CNNs, 1D CNNs, and long short-term memory to address the subtasks in a hierarchical fashion. The hierarchical modeling of spatio-temporal structure leads to high performance and efficiency. In addition, we propose a simple but effective user-ranking method to cope with the labeling subjectivity problem of user-created video summarization, leading to the labeling quality refinement for robust supervised learning. Experimental results show that our approach outperforms the state-of-the-art video summarization methods on two benchmark datasets.

Original language: English
Article number: 8585041
Pages (from-to): 2654-2664
Number of pages: 11
Journal: IEEE Transactions on Image Processing
Volume: 28
Issue number: 6
Publication status: Published - Jun 2019
