Locality constrained encoding of frequency and spatial information for image classification

Yongsheng Pan, Yong Xia, Yang Song, Weidong Cai

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

The bag-of-features (BoF) model provides a way to construct high-level representations for image classification. Although spatial pyramid matching (SPM) has been incorporated into many of its extensions, these models intrinsically lack a mechanism for using frequency-domain information. In this paper, we propose the locality-constrained encoding of frequency and spatial information (LEFSI) algorithm, in which an image is decomposed into multiple frequency components and each component is further decomposed into subregions using SPM. Scale-invariant feature transform (SIFT) descriptors are first calculated in each subregion and then converted into a global descriptor using a codebook generated on a category-by-category basis and locality-constrained linear coding (LLC). The image feature is defined as the concatenation of the global descriptors constructed in all subregions. We evaluated this algorithm against several state-of-the-art models on six benchmark datasets. Our results suggest that the proposed LEFSI algorithm can describe images more effectively and provide more accurate image classification.
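The encoding steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: a single-level Haar wavelet stands in for the frequency decomposition, the pyramid uses 1x1, 2x2 and 4x4 grids, LLC codes are max-pooled within each subregion, and the neighbour count `k` and regulariser `lam` are arbitrary. The function names (`haar_decompose`, `spm_cells`, `llc_code`, `encode_band`) are hypothetical. Descriptors and their positions are taken as precomputed inputs rather than extracted with SIFT, and the category-by-category codebook construction is not shown.

```python
import numpy as np


def haar_decompose(img):
    """One-level orthonormal 2-D Haar transform (assumed stand-in for the
    paper's frequency decomposition); returns four half-size sub-bands."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]  # crop to even dimensions
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    return {
        "approx": (a + b + c + d) / 2.0,       # low-pass in both directions
        "col_detail": (a - b + c - d) / 2.0,   # differences between columns
        "row_detail": (a + b - c - d) / 2.0,   # differences between rows
        "diag_detail": (a - b - c + d) / 2.0,  # diagonal differences
    }


def spm_cells(shape, levels=(1, 2, 4)):
    """Return (row0, row1, col0, col1) bounds for every cell of every
    pyramid level; the level choice is an assumption."""
    h, w = shape
    cells = []
    for n in levels:
        rows = np.linspace(0, h, n + 1).astype(int)
        cols = np.linspace(0, w, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                cells.append((rows[i], rows[i + 1], cols[j], cols[j + 1]))
    return cells


def llc_code(x, codebook, k=5, lam=1e-4):
    """Approximated LLC: reconstruct x from its k nearest codebook atoms
    under a sum-to-one constraint; all other coefficients stay zero."""
    k = min(k, len(codebook))
    dist2 = np.sum((codebook - x) ** 2, axis=1)
    nearest = np.argsort(dist2)[:k]
    z = codebook[nearest] - x                 # atoms shifted so x is the origin
    cov = z @ z.T                             # k x k local covariance
    cov += (lam * np.trace(cov) + 1e-12) * np.eye(k)  # regularise for stability
    w = np.linalg.solve(cov, np.ones(k))
    code = np.zeros(len(codebook))
    code[nearest] = w / w.sum()               # enforce the sum-to-one constraint
    return code


def encode_band(descriptors, positions, shape, codebook, levels=(1, 2, 4)):
    """Max-pool LLC codes inside each SPM cell of one sub-band and
    concatenate the pooled vectors (max pooling is an assumption)."""
    codes = np.array([llc_code(x, codebook) for x in descriptors])
    pooled = []
    for r0, r1, c0, c1 in spm_cells(shape, levels):
        inside = ((positions[:, 0] >= r0) & (positions[:, 0] < r1)
                  & (positions[:, 1] >= c0) & (positions[:, 1] < c1))
        if inside.any():
            pooled.append(codes[inside].max(axis=0))
        else:
            pooled.append(np.zeros(len(codebook)))  # empty cell
    return np.concatenate(pooled)
```

Under these assumptions, an image feature would be the concatenation of `encode_band` outputs over the sub-bands returned by `haar_decompose`, so its length is the number of sub-bands times the number of pyramid cells times the codebook size.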

Original language: English
Pages (from-to): 24891-24907
Number of pages: 17
Journal: Multimedia Tools and Applications
Volume: 77
Issue number: 19
State: Published - 1 Oct 2018

Keywords

  • Bag-of-features (BoF)
  • Image classification
  • Image decomposition
  • Spatial pyramid matching (SPM)
  • Wavelet transform
