Key Frame Extraction in the Summary Space

Xuelong Li; Bin Zhao; Xiaoqiang Lu

doi:10.1109/TCYB.2017.2718579

School of Artificial Intelligence, OPtics and Electronics

CAS - Xi'an Institute of Optics and Precision Mechanics

Research output: Contribution to journal › Article › peer-review

46 Scopus citations

Abstract

Key frame extraction is an efficient way to create a video summary that helps users quickly comprehend the video content. Generally, key frames should be representative of the video content and diverse enough to reduce redundancy. Based on the assumption that video data lie near a subspace of a high-dimensional space, this paper proposes a new approach called key frame extraction in the summary space. The approach aims to find the representative frames of the video and to filter out similar frames from the representative frame set. First, the video data are mapped to a high-dimensional space, called the summary space. Then, a new representation is learned for each frame by analyzing the intrinsic structure of the summary space. The learned representation reflects the representativeness of the frame and is used to select representative frames. Next, a perceptual hash algorithm measures the similarity between representative frames, and the key frame set is obtained by filtering out similar frames from the representative frame set. Finally, the video summary is constructed by arranging the key frames in temporal order. The ground truth, created by filtering out similar frames from human-created summaries, is used to evaluate the quality of the video summary. Experimental results on 80 videos from two datasets indicate that the approach outperforms several traditional approaches.
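The filtering stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which perceptual hash variant is used, so the sketch assumes a simple average hash, and the names `average_hash`, `hamming`, `filter_similar`, the `scores` input (standing in for the learned representativeness), and the `max_distance` threshold are all hypothetical.

```python
import numpy as np


def average_hash(frame: np.ndarray, hash_size: int = 8) -> np.ndarray:
    """Flat boolean hash of a 2-D grayscale frame (average-hash variant, an assumption)."""
    h, w = frame.shape
    bh, bw = h // hash_size, w // hash_size
    # Crop so both dimensions divide evenly into hash_size blocks.
    cropped = frame[: bh * hash_size, : bw * hash_size].astype(np.float64)
    # Downsample to hash_size x hash_size by averaging each block.
    blocks = cropped.reshape(hash_size, bh, hash_size, bw).mean(axis=(1, 3))
    # One bit per block: brighter than the mean of all blocks or not.
    return (blocks > blocks.mean()).ravel()


def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two hashes."""
    return int(np.count_nonzero(a != b))


def filter_similar(frames, scores, candidates, max_distance=5):
    """Greedy filter over candidate frame indices.

    Candidates are visited from highest to lowest score (hypothetical
    representativeness values). A frame is kept only if its hash differs
    from every kept frame by more than max_distance bits. The kept
    indices are returned in temporal (ascending index) order.
    """
    kept, hashes = [], {}
    for idx in sorted(candidates, key=lambda i: scores[i], reverse=True):
        h = average_hash(frames[idx])
        if all(hamming(h, hashes[k]) > max_distance for k in kept):
            kept.append(idx)
            hashes[idx] = h
    return sorted(kept)
```

Visiting candidates by descending score means that, among near-duplicates, the most representative frame survives; the final sort restores temporal order for the summary. The threshold would need tuning per dataset.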

Original language	English
Pages (from-to)	1923-1934
Number of pages	12
Journal	IEEE Transactions on Cybernetics
Volume	48
Issue number	6
DOIs	https://doi.org/10.1109/TCYB.2017.2718579
State	Published - Jun 2018

Keywords

Diverse
key frame
representative
summary space

Access to Document

10.1109/TCYB.2017.2718579

Cite this

@article{02dfd2f399924075bd6f8702c0438f64,

title = "Key Frame Extraction in the Summary Space",

abstract = "Key frame extraction is an efficient way to create the video summary which helps users obtain a quick comprehension of the video content. Generally, the key frames should be representative of the video content, meanwhile, diverse to reduce the redundancy. Based on the assumption that the video data are near a subspace of a high-dimensional space, a new approach, named as key frame extraction in the summary space, is proposed for key frame extraction in this paper. The proposed approach aims to find the representative frames of the video and filter out similar frames from the representative frame set. First of all, the video data are mapped to a high-dimensional space, named as summary space. Then, a new representation is learned for each frame by analyzing the intrinsic structure of the summary space. Specifically, the learned representation can reflect the representativeness of the frame, and is utilized to select representative frames. Next, the perceptual hash algorithm is employed to measure the similarity of representative frames. As a result, the key frame set is obtained after filtering out similar frames from the representative frame set. Finally, the video summary is constructed by assigning the key frames in temporal order. Additionally, the ground truth, created by filtering out similar frames from human-created summaries, is utilized to evaluate the quality of the video summary. Compared with several traditional approaches, the experimental results on 80 videos from two datasets indicate the superior performance of our approach.",

keywords = "Diverse, key frame, representative, summary space",

author = "Xuelong Li and Bin Zhao and Xiaoqiang Lu",

note = "Publisher Copyright: {\textcopyright} 2013 IEEE.",

year = "2018",

month = jun,

doi = "10.1109/TCYB.2017.2718579",

language = "English",

volume = "48",

pages = "1923--1934",

journal = "IEEE Transactions on Cybernetics",

issn = "2168-2267",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "6",

}

TY - JOUR

T1 - Key Frame Extraction in the Summary Space

AU - Li, Xuelong

AU - Zhao, Bin

AU - Lu, Xiaoqiang

PY - 2018/6

Y1 - 2018/6

AB - Key frame extraction is an efficient way to create the video summary which helps users obtain a quick comprehension of the video content. Generally, the key frames should be representative of the video content, meanwhile, diverse to reduce the redundancy. Based on the assumption that the video data are near a subspace of a high-dimensional space, a new approach, named as key frame extraction in the summary space, is proposed for key frame extraction in this paper. The proposed approach aims to find the representative frames of the video and filter out similar frames from the representative frame set. First of all, the video data are mapped to a high-dimensional space, named as summary space. Then, a new representation is learned for each frame by analyzing the intrinsic structure of the summary space. Specifically, the learned representation can reflect the representativeness of the frame, and is utilized to select representative frames. Next, the perceptual hash algorithm is employed to measure the similarity of representative frames. As a result, the key frame set is obtained after filtering out similar frames from the representative frame set. Finally, the video summary is constructed by assigning the key frames in temporal order. Additionally, the ground truth, created by filtering out similar frames from human-created summaries, is utilized to evaluate the quality of the video summary. Compared with several traditional approaches, the experimental results on 80 videos from two datasets indicate the superior performance of our approach.

KW - Diverse

KW - key frame

KW - representative

KW - summary space

UR - http://www.scopus.com/inward/record.url?scp=85023169635&partnerID=8YFLogxK

U2 - 10.1109/TCYB.2017.2718579

DO - 10.1109/TCYB.2017.2718579

M3 - Article

C2 - 28693004

AN - SCOPUS:85023169635

SN - 2168-2267

VL - 48

SP - 1923

EP - 1934

JO - IEEE Transactions on Cybernetics

JF - IEEE Transactions on Cybernetics

IS - 6

ER -
