A top-down approach for video summarization

Genliang Guan; Zhiyong Wang; Shaohui Mei; Max Ott; Mingyi He; David Dagan Feng

doi:10.1145/2632267

School of Electronics and Information

Research output: Contribution to journal › Article › peer-review

40 Citations (Scopus)

Abstract

While most existing video summarization approaches aim to identify important frames of a video from either a global or local perspective, we propose a top-down approach consisting of scene identification and scene summarization. For scene identification, we represent each frame with global features and utilize a scalable clustering method. We then formulate scene summarization as choosing those frames that best cover a set of local descriptors with minimal redundancy. In addition, we develop a visual word-based approach to make our approach more computationally scalable. Experimental results on two benchmark datasets demonstrate that our proposed approach clearly outperforms the state-of-the-art.
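The abstract frames scene summarization as choosing frames that cover a set of local descriptors with little redundancy, but it does not state how that selection is solved. As a rough illustration only, the sketch below uses a standard greedy set-cover heuristic, which is an assumption and not the authors' formulation. The function name `greedy_keyframes`, the `coverage` parameter, and the representation of each frame as a set of visual-word IDs are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_keyframes(frame_words: Dict[int, Set[int]], coverage: float = 0.9) -> List[int]:
    """Greedily pick frames until `coverage` of all visual words in the scene are covered.

    frame_words maps a frame index to the set of visual-word IDs found in that
    frame (a hypothetical representation, not taken from the paper).
    """
    universe: Set[int] = set().union(*frame_words.values())
    target = coverage * len(universe)
    covered: Set[int] = set()
    selected: List[int] = []
    while len(covered) < target:
        # Choose the frame adding the most uncovered words; ties go to the lowest index.
        best = max(sorted(frame_words), key=lambda f: len(frame_words[f] - covered))
        gain = frame_words[best] - covered
        if not gain:
            break  # no remaining frame adds anything new
        selected.append(best)
        covered |= gain
    return selected


# Toy scene: four frames described by small sets of visual-word IDs.
scene = {0: {1, 2, 3}, 1: {3, 4}, 2: {1, 2}, 3: {4, 5, 6, 7}}
full_summary = greedy_keyframes(scene, coverage=1.0)
half_summary = greedy_keyframes(scene, coverage=0.5)
```

Picking frames by marginal gain is what discourages redundancy here: a frame whose words are already covered contributes nothing and is never selected.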

Original language	English
Article number	4
Journal	ACM Transactions on Multimedia Computing, Communications and Applications
Volume	11
Issue number	1
DOI	https://doi.org/10.1145/2632267
Publication status	Published - Aug 2014

Cite this

@article{f4abc359a24c41c08d8e2d3fbe21e2da,

title = "A top-down approach for video summarization",

abstract = "While most existing video summarization approaches aim to identify important frames of a video from either a global or local perspective, we propose a top-down approach consisting of scene identification and scene summarization. For scene identification, we represent each frame with global features and utilize a scalable clustering method. We then formulate scene summarization as choosing those frames that best cover a set of local descriptors with minimal redundancy. In addition, we develop a visual word-based approach to make our approach more computationally scalable. Experimental results on two benchmark datasets demonstrate that our proposed approach clearly outperforms the state-of-the-art.",

keywords = "Clustering, Keyframe extraction, Keypoint, Local visual word, Scene identification",

author = "Genliang Guan and Zhiyong Wang and Shaohui Mei and Max Ott and Mingyi He and Feng, {David Dagan}",

year = "2014",

month = aug,

doi = "10.1145/2632267",

language = "English",

volume = "11",

journal = "ACM Transactions on Multimedia Computing, Communications and Applications",

issn = "1551-6857",

publisher = "Association for Computing Machinery (ACM)",

number = "1",

}

TY - JOUR

T1 - A top-down approach for video summarization

AU - Guan, Genliang

AU - Wang, Zhiyong

AU - Mei, Shaohui

AU - Ott, Max

AU - He, Mingyi

AU - Feng, David Dagan

PY - 2014/8

Y1 - 2014/8

N2 - While most existing video summarization approaches aim to identify important frames of a video from either a global or local perspective, we propose a top-down approach consisting of scene identification and scene summarization. For scene identification, we represent each frame with global features and utilize a scalable clustering method. We then formulate scene summarization as choosing those frames that best cover a set of local descriptors with minimal redundancy. In addition, we develop a visual word-based approach to make our approach more computationally scalable. Experimental results on two benchmark datasets demonstrate that our proposed approach clearly outperforms the state-of-the-art.

AB - While most existing video summarization approaches aim to identify important frames of a video from either a global or local perspective, we propose a top-down approach consisting of scene identification and scene summarization. For scene identification, we represent each frame with global features and utilize a scalable clustering method. We then formulate scene summarization as choosing those frames that best cover a set of local descriptors with minimal redundancy. In addition, we develop a visual word-based approach to make our approach more computationally scalable. Experimental results on two benchmark datasets demonstrate that our proposed approach clearly outperforms the state-of-the-art.

KW - Clustering

KW - Keyframe extraction

KW - Keypoint

KW - Local visual word

KW - Scene identification

UR - http://www.scopus.com/inward/record.url?scp=84906857262&partnerID=8YFLogxK

U2 - 10.1145/2632267

DO - 10.1145/2632267

M3 - Article

AN - SCOPUS:84906857262

SN - 1551-6857

VL - 11

JO - ACM Transactions on Multimedia Computing, Communications and Applications

JF - ACM Transactions on Multimedia Computing, Communications and Applications

IS - 1

M1 - 4

ER -
