TY - JOUR
T1 - User-Ranking Video Summarization with Multi-Stage Spatio-Temporal Representation
AU - Huang, Siyu
AU - Li, Xi
AU - Zhang, Zhongfei
AU - Wu, Fei
AU - Han, Junwei
N1 - Publisher Copyright: © 1992-2012 IEEE.
PY - 2019/6
Y1 - 2019/6
N2 - Video summarization is a challenging task, mainly due to the difficulties in learning complicated semantic structural relations between videos and summaries. In this paper, we present a novel supervised video summarization scheme based on three-stage deep neural networks. The scheme takes a divide-and-conquer strategy to resolve the complicated task of 3D video summarization into a set of easy and flexible computational subtasks, and then to sequentially perform 2D CNNs, 1D CNNs, and long short-term memory to address the subtasks in a hierarchical fashion. The hierarchical modeling of spatio-temporal structure leads to high performance and efficiency. In addition, we propose a simple but effective user-ranking method to cope with the labeling subjectivity problem of user-created video summarization, leading to the labeling quality refinement for robust supervised learning. Experimental results show that our approach outperforms the state-of-the-art video summarization methods on two benchmark datasets.
AB - Video summarization is a challenging task, mainly due to the difficulties in learning complicated semantic structural relations between videos and summaries. In this paper, we present a novel supervised video summarization scheme based on three-stage deep neural networks. The scheme takes a divide-and-conquer strategy to resolve the complicated task of 3D video summarization into a set of easy and flexible computational subtasks, and then to sequentially perform 2D CNNs, 1D CNNs, and long short-term memory to address the subtasks in a hierarchical fashion. The hierarchical modeling of spatio-temporal structure leads to high performance and efficiency. In addition, we propose a simple but effective user-ranking method to cope with the labeling subjectivity problem of user-created video summarization, leading to the labeling quality refinement for robust supervised learning. Experimental results show that our approach outperforms the state-of-the-art video summarization methods on two benchmark datasets.
KW - convolutional neural network
KW - multi-user inconsistency
KW - recurrent neural network
KW - user ranking
KW - video summarization
UR - http://www.scopus.com/inward/record.url?scp=85058992212&partnerID=8YFLogxK
U2 - 10.1109/TIP.2018.2889265
DO - 10.1109/TIP.2018.2889265
M3 - Article
AN - SCOPUS:85058992212
SN - 1057-7149
VL - 28
SP - 2654
EP - 2664
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
IS - 6
M1 - 8585041
ER -