HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization

Bin Zhao; Xuelong Li; Xiaoqiang Lu

doi:10.1109/CVPR.2018.00773

Institute of Optoelectronics and Intelligence (光电与智能研究院)

CAS - Xi'an Institute of Optics and Precision Mechanics

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

205 Citations (Scopus)

Abstract

Although video summarization has achieved great success in recent years, few approaches have realized the influence of video structure on the summarization results. As we know, the video data follow a hierarchical structure, i.e., a video is composed of shots, and a shot is composed of several frames. Generally, shots provide the activity-level information for people to understand the video content. While few existing summarization approaches pay attention to the shot segmentation procedure. They generate shots by some trivial strategies, such as fixed length segmentation, which may destroy the underlying hierarchical structure of video data and further reduce the quality of generated summaries. To address this problem, we propose a structure-adaptive video summarization approach that integrates shot segmentation and video summarization into a Hierarchical Structure-Adaptive RNN, denoted as HSA-RNN. We evaluate the proposed approach on four popular datasets, i.e., SumMe, TVsum, CoSum and VTW. The experimental results have demonstrated the effectiveness of HSA-RNN in the video summarization task.

Original language	English
Title of host publication	Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
Publisher	IEEE Computer Society
Pages	7405-7414
Number of pages	10
ISBN (Electronic)	9781538664209
DOI	https://doi.org/10.1109/CVPR.2018.00773
Publication status	Published - 14 Dec 2018
Event	31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 - Salt Lake City, United States. Duration: 18 Jun 2018 → 22 Jun 2018

Publication series

Name	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print)	1063-6919

Conference

Conference	31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
Country/Territory	United States
City	Salt Lake City
Period	18/06/18 → 22/06/18

Cite this

Zhao, B., Li, X., & Lu, X. (2018). HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization. In Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 (pp. 7405-7414). Article 8578871 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). IEEE Computer Society. https://doi.org/10.1109/CVPR.2018.00773

@inproceedings{7943be1db87f4b978501c7143b71c42a,

title = "HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization",

abstract = "Although video summarization has achieved great success in recent years, few approaches have realized the influence of video structure on the summarization results. As we know, the video data follow a hierarchical structure, i.e., a video is composed of shots, and a shot is composed of several frames. Generally, shots provide the activity-level information for people to understand the video content. While few existing summarization approaches pay attention to the shot segmentation procedure. They generate shots by some trivial strategies, such as fixed length segmentation, which may destroy the underlying hierarchical structure of video data and further reduce the quality of generated summaries. To address this problem, we propose a structure-adaptive video summarization approach that integrates shot segmentation and video summarization into a Hierarchical Structure-Adaptive RNN, denoted as HSA-RNN. We evaluate the proposed approach on four popular datasets, i.e., SumMe, TVsum, CoSum and VTW. The experimental results have demonstrated the effectiveness of HSA-RNN in the video summarization task.",

author = "Bin Zhao and Xuelong Li and Xiaoqiang Lu",

note = "Publisher Copyright: {\textcopyright} 2018 IEEE.; 31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 ; Conference date: 18-06-2018 Through 22-06-2018",

year = "2018",

month = dec,

day = "14",

doi = "10.1109/CVPR.2018.00773",

language = "English",

series = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",

publisher = "IEEE Computer Society",

pages = "7405--7414",

booktitle = "Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018",

}

Zhao, B, Li, X & Lu, X 2018, HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization. in Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018., 8578871, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE Computer Society, pp. 7405-7414, 31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, United States, 18/06/18. https://doi.org/10.1109/CVPR.2018.00773

HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization. / Zhao, Bin; Li, Xuelong; Lu, Xiaoqiang.
Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018. IEEE Computer Society, 2018. p. 7405-7414 8578871 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition).

TY - GEN

T1 - HSA-RNN

T2 - 31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018

AU - Zhao, Bin

AU - Li, Xuelong

AU - Lu, Xiaoqiang

PY - 2018/12/14

Y1 - 2018/12/14

N2 - Although video summarization has achieved great success in recent years, few approaches have realized the influence of video structure on the summarization results. As we know, the video data follow a hierarchical structure, i.e., a video is composed of shots, and a shot is composed of several frames. Generally, shots provide the activity-level information for people to understand the video content. While few existing summarization approaches pay attention to the shot segmentation procedure. They generate shots by some trivial strategies, such as fixed length segmentation, which may destroy the underlying hierarchical structure of video data and further reduce the quality of generated summaries. To address this problem, we propose a structure-adaptive video summarization approach that integrates shot segmentation and video summarization into a Hierarchical Structure-Adaptive RNN, denoted as HSA-RNN. We evaluate the proposed approach on four popular datasets, i.e., SumMe, TVsum, CoSum and VTW. The experimental results have demonstrated the effectiveness of HSA-RNN in the video summarization task.

AB - Although video summarization has achieved great success in recent years, few approaches have realized the influence of video structure on the summarization results. As we know, the video data follow a hierarchical structure, i.e., a video is composed of shots, and a shot is composed of several frames. Generally, shots provide the activity-level information for people to understand the video content. While few existing summarization approaches pay attention to the shot segmentation procedure. They generate shots by some trivial strategies, such as fixed length segmentation, which may destroy the underlying hierarchical structure of video data and further reduce the quality of generated summaries. To address this problem, we propose a structure-adaptive video summarization approach that integrates shot segmentation and video summarization into a Hierarchical Structure-Adaptive RNN, denoted as HSA-RNN. We evaluate the proposed approach on four popular datasets, i.e., SumMe, TVsum, CoSum and VTW. The experimental results have demonstrated the effectiveness of HSA-RNN in the video summarization task.

UR - http://www.scopus.com/inward/record.url?scp=85062862552&partnerID=8YFLogxK

U2 - 10.1109/CVPR.2018.00773

DO - 10.1109/CVPR.2018.00773

M3 - Conference contribution

AN - SCOPUS:85062862552

T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

SP - 7405

EP - 7414

BT - Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018

PB - IEEE Computer Society

Y2 - 18 June 2018 through 22 June 2018

ER -

Zhao B, Li X, Lu X. HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization. In Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018. IEEE Computer Society. 2018. p. 7405-7414. 8578871. (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). doi: 10.1109/CVPR.2018.00773
