Key Frame Extraction in the Summary Space

Xuelong Li, Bin Zhao, Xiaoqiang Lu

Research output: Contribution to journal › Article › peer-review

44 Scopus citations

Abstract

Key frame extraction is an efficient way to create a video summary that helps users quickly comprehend the video content. In general, key frames should be representative of the video content and also diverse, to reduce redundancy. Based on the assumption that video data lie near a subspace of a high-dimensional space, this paper proposes a new approach, named key frame extraction in the summary space. The approach aims to find the representative frames of the video and to filter out similar frames from the representative frame set. First, the video data are mapped to a high-dimensional space, named the summary space. Then, a new representation is learned for each frame by analyzing the intrinsic structure of the summary space. The learned representation reflects the representativeness of the frame and is used to select representative frames. Next, the perceptual hash algorithm is employed to measure the similarity of representative frames, and the key frame set is obtained by filtering out similar frames from the representative frame set. Finally, the video summary is constructed by arranging the key frames in temporal order. Additionally, the ground truth, created by filtering out similar frames from human-created summaries, is used to evaluate the quality of the video summary. Experimental results on 80 videos from two datasets indicate that our approach outperforms several traditional approaches.
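
The filtering step can be illustrated with a small sketch. The abstract does not say which perceptual hash variant or similarity threshold the authors use, so the code below assumes a simple average-hash variant and a hypothetical Hamming-distance threshold `max_distance`. The names `average_hash`, `hamming`, and `filter_similar` are illustrative and not taken from the paper. The selection of representative frames is assumed to have happened already and is passed in as `candidate_indices`.

```python
import numpy as np


def average_hash(frame, hash_size=8):
    """Return a flat boolean hash of a 2-D grayscale frame (average-hash variant)."""
    h, w = frame.shape
    # Crop so the frame divides evenly into hash_size x hash_size blocks.
    bh, bw = h // hash_size, w // hash_size
    cropped = frame[:bh * hash_size, :bw * hash_size].astype(np.float64)
    # Downsample by averaging the pixels in each block.
    small = cropped.reshape(hash_size, bh, hash_size, bw).mean(axis=(1, 3))
    # One bit per block: is the block brighter than the mean block value?
    return (small > small.mean()).ravel()


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return int(np.count_nonzero(a != b))


def filter_similar(frames, candidate_indices, max_distance=5):
    """Drop candidates that are near-duplicates of an already kept frame.

    max_distance is an assumed threshold, not a value from the paper.
    Candidates are visited in temporal order, so the result is also in
    temporal order.
    """
    kept, kept_hashes = [], []
    for idx in sorted(candidate_indices):
        h = average_hash(frames[idx])
        if all(hamming(h, k) > max_distance for k in kept_hashes):
            kept.append(idx)
            kept_hashes.append(h)
    return kept
```

Calling `filter_similar(frames, candidate_indices)` on a list of grayscale frames returns the indices of the retained key frames. Processing the candidates in sorted order keeps the first frame of each group of near-duplicates and yields the temporal ordering described in the abstract.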

Original language: English
Pages (from-to): 1923-1934
Number of pages: 12
Journal: IEEE Transactions on Cybernetics
Volume: 48
Issue number: 6
State: Published - Jun 2018

Keywords

  • Diverse
  • key frame
  • representative
  • summary space
