Category driven deep recurrent neural network for video summarization

Xinhui Song; Ke Chen; Jie Lei; Li Sun; Zhiyuan Wang; Lei Xie; Mingli Song

doi:10.1109/ICMEW.2016.7574720

College of Computer Science

Zhejiang University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

22 Citations (Scopus)

Abstract

A large number of videos are generated and uploaded to video websites (like youku, youtube) every day and video websites play more and more important roles in human life. While bringing convenience, the big video data raise the difficulty of video summarization to allow users to browse a video easily. However, although there are many existing video summarization approaches, the key frames selected fail to integrate the large video contexts and the qualities of the summarized results are difficult to evaluate because of the lack of ground-truth. Inspired by the previous methods that extract key frames, we propose a deep recurrent neural network model, which learns to extract category-driven key frames. First, we sequentially extract a fixed number of key frames using time-dependent location networks. Second, we utilize recurrent neural network to integrate information of the key frames to classify the category of the video. Therefore, the quality of the extracted key frames could be evaluated by the categorization accuracy. Experiments on a 500-video dataset show that the proposed scheme extracts reasonable key frames and outperforms other methods by quantitative evaluation.
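
The two steps described in the abstract (sequential key-frame selection, then recurrent integration for classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the tanh recurrent cell, the sigmoid location head with one weight vector per step (one possible reading of "time-dependent location networks"), the layer sizes, and every name below (`summarize_and_classify`, `W_loc`, and so on) are assumptions made for illustration. The weights are random rather than trained, and the training procedure is omitted entirely.

```python
import math
import random

def matvec(vec, mat):
    """Multiply a row vector by a matrix given as a list of rows."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]

def rand_matrix(rng, rows, cols, scale=0.1):
    """Small random weights; a real model would learn these."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def summarize_and_classify(frames, num_keyframes, num_classes, hidden=8, seed=0):
    """frames: list of T feature vectors of equal length D (hypothetical input)."""
    rng = random.Random(seed)
    T, D = len(frames), len(frames[0])
    W_in = rand_matrix(rng, D, hidden)               # frame features -> hidden
    W_h = rand_matrix(rng, hidden, hidden)           # hidden -> hidden
    W_loc = rand_matrix(rng, num_keyframes, hidden)  # one location head per step (assumption)
    W_cls = rand_matrix(rng, hidden, num_classes)    # hidden -> category logits

    h = [0.0] * hidden
    picked = []
    for t in range(num_keyframes):
        # Step 1: the location head maps the hidden state to a position in (0, 1),
        # which is scaled to a frame index.
        score = sum(a * b for a, b in zip(h, W_loc[t]))
        pos = 1.0 / (1.0 + math.exp(-score))
        idx = int(round(pos * (T - 1)))
        picked.append(idx)
        # Step 2: the recurrent update integrates the selected frame.
        x = matvec(frames[idx], W_in)
        r = matvec(h, W_h)
        h = [math.tanh(a + b) for a, b in zip(x, r)]

    # Classify the video from the final hidden state (softmax over categories).
    logits = matvec(h, W_cls)
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return picked, probs

# Toy input: 30 frames with 4 synthetic features each.
frames = [[math.sin(0.1 * t * (d + 1)) for d in range(4)] for t in range(30)]
picked, probs = summarize_and_classify(frames, num_keyframes=5, num_classes=3)
```

In a trained model of this kind, the classification accuracy obtained from the selected frames would serve as the proxy for summary quality that the abstract describes.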

Original language	English
Title of host publication	2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9781509015528
DOI	https://doi.org/10.1109/ICMEW.2016.7574720
Publication status	Published - 22 Sept 2016
Event	2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016 - Seattle, United States. Duration: 11 Jul 2016 → 15 Jul 2016

Publication series

Name	2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016

Conference

Conference	2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016
Country/Territory	United States
City	Seattle
Period	11 Jul 2016 → 15 Jul 2016

Access to Document

10.1109/ICMEW.2016.7574720

Other files and links

Link to publication in Scopus

Cite this

Song, X., Chen, K., Lei, J., Sun, L., Wang, Z., Xie, L., & Song, M. (2016). Category driven deep recurrent neural network for video summarization. In 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016 Article 7574720 (2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICMEW.2016.7574720

@inproceedings{1831b6169b474b9cb6a0fb1746e18f52,

title = "Category driven deep recurrent neural network for video summarization",

abstract = "A large number of videos are generated and uploaded to video websites (like youku, youtube) every day and video websites play more and more important roles in human life. While bringing convenience, the big video data raise the difficulty of video summarization to allow users to browse a video easily. However, although there are many existing video summarization approaches, the key frames selected fail to integrate the large video contexts and the qualities of the summarized results are difficult to evaluate because of the lack of ground-truth. Inspired by the previous methods that extract key frames, we propose a deep recurrent neural network model, which learns to extract category-driven key frames. First, we sequentially extract a fixed number of key frames using time-dependent location networks. Second, we utilize recurrent neural network to integrate information of the key frames to classify the category of the video. Therefore, the quality of the extracted key frames could be evaluated by the categorization accuracy. Experiments on a 500-video dataset show that the proposed scheme extracts reasonable key frames and outperforms other methods by quantitative evaluation.",

keywords = "Recurrent video summarization, Reinforcement learning, Video categorization",

author = "Xinhui Song and Ke Chen and Jie Lei and Li Sun and Zhiyuan Wang and Lei Xie and Mingli Song",

note = "Publisher Copyright: {\textcopyright} 2016 IEEE.; 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016 ; Conference date: 11-07-2016 Through 15-07-2016",

year = "2016",

month = sep,

day = "22",

doi = "10.1109/ICMEW.2016.7574720",

language = "English",

series = "2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

booktitle = "2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016",

}

Song, X, Chen, K, Lei, J, Sun, L, Wang, Z, Xie, L & Song, M 2016, Category driven deep recurrent neural network for video summarization. in 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016., 7574720, 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016, Institute of Electrical and Electronics Engineers Inc., 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016, Seattle, United States, 11/07/16. https://doi.org/10.1109/ICMEW.2016.7574720

Category driven deep recurrent neural network for video summarization. / Song, Xinhui; Chen, Ke; Lei, Jie et al.
2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016. Institute of Electrical and Electronics Engineers Inc., 2016. 7574720 (2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016).


TY - GEN

T1 - Category driven deep recurrent neural network for video summarization

AU - Song, Xinhui

AU - Chen, Ke

AU - Lei, Jie

AU - Sun, Li

AU - Wang, Zhiyuan

AU - Xie, Lei

AU - Song, Mingli

PY - 2016/9/22

Y1 - 2016/9/22

N2 - A large number of videos are generated and uploaded to video websites (like youku, youtube) every day and video websites play more and more important roles in human life. While bringing convenience, the big video data raise the difficulty of video summarization to allow users to browse a video easily. However, although there are many existing video summarization approaches, the key frames selected fail to integrate the large video contexts and the qualities of the summarized results are difficult to evaluate because of the lack of ground-truth. Inspired by the previous methods that extract key frames, we propose a deep recurrent neural network model, which learns to extract category-driven key frames. First, we sequentially extract a fixed number of key frames using time-dependent location networks. Second, we utilize recurrent neural network to integrate information of the key frames to classify the category of the video. Therefore, the quality of the extracted key frames could be evaluated by the categorization accuracy. Experiments on a 500-video dataset show that the proposed scheme extracts reasonable key frames and outperforms other methods by quantitative evaluation.

AB - A large number of videos are generated and uploaded to video websites (like youku, youtube) every day and video websites play more and more important roles in human life. While bringing convenience, the big video data raise the difficulty of video summarization to allow users to browse a video easily. However, although there are many existing video summarization approaches, the key frames selected fail to integrate the large video contexts and the qualities of the summarized results are difficult to evaluate because of the lack of ground-truth. Inspired by the previous methods that extract key frames, we propose a deep recurrent neural network model, which learns to extract category-driven key frames. First, we sequentially extract a fixed number of key frames using time-dependent location networks. Second, we utilize recurrent neural network to integrate information of the key frames to classify the category of the video. Therefore, the quality of the extracted key frames could be evaluated by the categorization accuracy. Experiments on a 500-video dataset show that the proposed scheme extracts reasonable key frames and outperforms other methods by quantitative evaluation.

KW - Recurrent video summarization

KW - Reinforcement learning

KW - Video categorization

UR - http://www.scopus.com/inward/record.url?scp=84992088974&partnerID=8YFLogxK

U2 - 10.1109/ICMEW.2016.7574720

DO - 10.1109/ICMEW.2016.7574720

M3 - Conference contribution

AN - SCOPUS:84992088974

T3 - 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016

BT - 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016

Y2 - 11 July 2016 through 15 July 2016

ER -

Song X, Chen K, Lei J, Sun L, Wang Z, Xie L et al. Category driven deep recurrent neural network for video summarization. In 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016. Institute of Electrical and Electronics Engineers Inc. 2016. 7574720. (2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016). doi: 10.1109/ICMEW.2016.7574720
