Statistical quantization for similarity search

Qi Wang; Guokang Zhu; Yuan Yuan

doi:10.1016/j.cviu.2014.03.002

Research output: Contribution to journal › Article › peer-review

21 Scopus citations

Abstract

Approximate nearest neighbor search, which allows fast queries at a predictable cost in search quality, has attracted much attention recently. Among related methods, k-means quantizers are possibly the most adaptive and have shown higher search accuracy than the alternatives. However, traditional quantizers share a common problem: during out-of-sample extension, the naive strategy considers only similarities in Euclidean space and ignores the statistical and geometrical properties of the data. To address this problem, this paper proposes a novel approach based on a generalized likelihood ratio analysis. In particular, the proposed method makes a physically meaningful discrimination on the affiliations of new samples with respect to the obtained Voronoi cells. This discrimination essentially imposes a measure of statistical consistency on out-of-sample extension. Experiments on two large data sets show that the proposed method is more effective than the benchmark algorithms.
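The contrast the abstract draws, between assigning a new sample by Euclidean distance alone and assigning it with the statistics of each cell in mind, can be illustrated with a small sketch. This is not the paper's generalized likelihood ratio formulation, which the abstract does not spell out. As a stand-in, it assumes each Voronoi cell is modelled by a diagonal Gaussian fitted to its training points, and the function names (`fit_cells`, `assign_euclidean`, `assign_likelihood`) and the toy data are hypothetical.

```python
import numpy as np


def fit_cells(points, labels, n_cells, var_floor=1e-6):
    """Per-cell mean and per-dimension variance (assumed diagonal Gaussian model)."""
    means = np.zeros((n_cells, points.shape[1]))
    variances = np.zeros_like(means)
    for k in range(n_cells):
        members = points[labels == k]
        means[k] = members.mean(axis=0)
        # The floor keeps the variance strictly positive for degenerate cells.
        variances[k] = members.var(axis=0) + var_floor
    return means, variances


def assign_euclidean(x, means):
    """Naive out-of-sample extension: pick the nearest cell centre."""
    return int(np.argmin(((means - x) ** 2).sum(axis=1)))


def assign_likelihood(x, means, variances):
    """Statistical alternative: pick the cell with the highest Gaussian log-likelihood."""
    log_lik = -0.5 * (
        np.log(2.0 * np.pi * variances) + (x - means) ** 2 / variances
    ).sum(axis=1)
    return int(np.argmax(log_lik))


# Toy data: cell 0 is widely spread around (0, 0), cell 1 is tight around (4, 0).
points = np.array([
    [-3.0, -3.0], [-3.0, 3.0], [3.0, -3.0], [3.0, 3.0], [0.0, 0.0],
    [3.8, -0.2], [3.8, 0.2], [4.2, -0.2], [4.2, 0.2],
])
labels = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1])
means, variances = fit_cells(points, labels, n_cells=2)

query = np.array([2.5, 0.0])
naive_cell = assign_euclidean(query, means)
stat_cell = assign_likelihood(query, means, variances)
print(naive_cell, stat_cell)
```

In this toy setup the query lies closer to the centre of the tight cell, so the Euclidean rule picks that cell, while the likelihood rule prefers the wide cell because the query sits far outside the tight cell's spread. The choice of a diagonal Gaussian is only one simple way to bring cell statistics into the decision.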

Original language	English
Pages (from-to)	22-30
Number of pages	9
Journal	Computer Vision and Image Understanding
Volume	124
DOIs	https://doi.org/10.1016/j.cviu.2014.03.002
State	Published - Jul 2014
Externally published	Yes

Keywords

Binary code
Computer vision
Hashing
Machine learning
Quantization
Similarity search

Cite this

@article{314a9e4c377a4f9eb46453246a0d4a18,

title = "Statistical quantization for similarity search",

abstract = "Approximate nearest neighbor search has attracted much attention recently, which allows for fast query with a predictable sacrifice in search quality. Among the related works, k-means quantizers are possibly the most adaptive methods, and have shown the superiority on search accuracy than the others. However, a common problem shared by the traditional quantizers is that during the out-of-sample extension process, the naive strategy considers only the similarities in Euclidean space without taking into account the statistical and geometrical properties of the data. To cope with this problem, in this paper a novel approach is proposed by formulating a generalized likelihood ratio analysis. In particular, the proposed method takes a physically meaningful discrimination on the affiliations of the new samples with respect to the obtained Voronoi cells. This discrimination essentially imposes the measure of statistical consistency on out-of-sample extension. The experimental studies on two large data sets show that the proposed method is more effective than the benchmark algorithms.",

keywords = "Binary code, Computer vision, Hashing, Machine learning, Quantization, Similarity search",

author = "Qi Wang and Guokang Zhu and Yuan Yuan",

year = "2014",

month = jul,

doi = "10.1016/j.cviu.2014.03.002",

language = "English",

volume = "124",

pages = "22--30",

journal = "Computer Vision and Image Understanding",

issn = "1077-3142",

publisher = "Academic Press Inc.",

}

TY - JOUR

T1 - Statistical quantization for similarity search

AU - Wang, Qi

AU - Zhu, Guokang

AU - Yuan, Yuan

PY - 2014/7

Y1 - 2014/7

AB - Approximate nearest neighbor search has attracted much attention recently, which allows for fast query with a predictable sacrifice in search quality. Among the related works, k-means quantizers are possibly the most adaptive methods, and have shown the superiority on search accuracy than the others. However, a common problem shared by the traditional quantizers is that during the out-of-sample extension process, the naive strategy considers only the similarities in Euclidean space without taking into account the statistical and geometrical properties of the data. To cope with this problem, in this paper a novel approach is proposed by formulating a generalized likelihood ratio analysis. In particular, the proposed method takes a physically meaningful discrimination on the affiliations of the new samples with respect to the obtained Voronoi cells. This discrimination essentially imposes the measure of statistical consistency on out-of-sample extension. The experimental studies on two large data sets show that the proposed method is more effective than the benchmark algorithms.

KW - Binary code

KW - Computer vision

KW - Hashing

KW - Machine learning

KW - Quantization

KW - Similarity search

UR - http://www.scopus.com/inward/record.url?scp=84901921955&partnerID=8YFLogxK

U2 - 10.1016/j.cviu.2014.03.002

DO - 10.1016/j.cviu.2014.03.002

M3 - Article

AN - SCOPUS:84901921955

SN - 1077-3142

VL - 124

SP - 22

EP - 30

JO - Computer Vision and Image Understanding

JF - Computer Vision and Image Understanding

ER -
