Video summarization via block sparse dictionary selection

Mingyang Ma, Shaohui Mei, Shuai Wan, Junhui Hou, Zhiyong Wang, David Dagan Feng

Research output: Contribution to journal › Article › peer-review

71 Scopus citations

Abstract

The explosive growth of video data has raised new challenges for many video processing tasks such as video browsing and retrieval; hence, effective and efficient video summarization (VS) is urgently needed to automatically condense a video into a succinct version. Recent years have witnessed advances in sparse representation based approaches for VS. However, existing methods analyze video frames individually for keyframe selection, which can lead to redundancy among the selected keyframes and poor robustness to outlier frames. Since adjacent frames are visually similar, candidate keyframes tend to occur in temporal blocks rather than merely being sparsely distributed. Therefore, in this paper, the block-sparsity of candidate keyframes is taken into consideration, and the VS problem is formulated as a block sparse dictionary selection model. Moreover, a simultaneous block version of the Orthogonal Matching Pursuit algorithm (SBOMP) is designed for model optimization, and two keyframe selection strategies are explored for each block. Experimental results on two benchmark datasets, namely VSumm and TVSum, demonstrate that the proposed SBOMP based VS method clearly outperforms several state-of-the-art sparse representation based methods in terms of F-score, redundancy among keyframes, and robustness to outlier frames.
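The abstract outlines, but does not detail, the SBOMP optimization. The sketch below illustrates the general idea of a simultaneous block Orthogonal Matching Pursuit used for dictionary selection, where frame features form the columns of a matrix and atoms are the frames themselves grouped into temporal blocks. It is a minimal sketch under assumptions of my own: the function name `sbomp_sketch`, its arguments, and the Frobenius-norm block scoring are illustrative, not the authors' exact formulation.

```python
import numpy as np

def sbomp_sketch(X, blocks, n_blocks_to_select):
    """Illustrative simultaneous block OMP for dictionary selection.

    X       : (d, n) array, one frame feature vector per column.
    blocks  : list of index arrays, each giving the columns of one temporal block.
    n_blocks_to_select : number of temporal blocks to pick as keyframe candidates.

    This is a generic sketch, not the paper's SBOMP implementation.
    """
    residual = X.copy()
    selected_blocks = []
    selected_cols = []

    for _ in range(n_blocks_to_select):
        # Score each unselected block by the total correlation between its
        # columns and the current residual (simultaneous criterion over all frames).
        best_block, best_score = None, -np.inf
        for b, idx in enumerate(blocks):
            if b in selected_blocks:
                continue
            corr = X[:, idx].T @ residual            # (|block|, n) correlations
            score = np.linalg.norm(corr, 'fro')
            if score > best_score:
                best_block, best_score = b, score

        selected_blocks.append(best_block)
        selected_cols.extend(blocks[best_block])

        # Re-fit all frames on the selected columns and update the residual.
        D = X[:, selected_cols]                      # current sub-dictionary
        coef, *_ = np.linalg.lstsq(D, X, rcond=None)
        residual = X - D @ coef

    return selected_blocks, selected_cols

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((128, 60))                       # 60 frames, 128-dim features
    blocks = [np.arange(i, i + 5) for i in range(0, 60, 5)]  # 12 temporal blocks of 5 frames
    sel_blocks, sel_cols = sbomp_sketch(X, blocks, n_blocks_to_select=3)
    print(sel_blocks)
```

From each selected block, a single representative keyframe could then be extracted, for example the frame whose reconstruction coefficients carry the most energy; the paper explores two such per-block selection strategies, whose details are not given in the abstract.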

Original language: English
Pages (from-to): 197-209
Number of pages: 13
Journal: Neurocomputing
Volume: 378
DOIs
State: Published - 22 Feb 2020

Keywords

  • Block-sparsity
  • Dictionary selection
  • Sparse representation
  • Video summarization
