Unsupervised Ensemble Hashing: Boosting Minimum Hamming Distance

Yufei Zha; Zhuling Qiu; Peng Zhang; Wei Huang

doi:10.1109/ACCESS.2020.2975883

School of Computer Science

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Hashing aims to learn discriminative binary codes for high-dimensional data to support approximate nearest neighbor search. However, the distance ranking obtained by traditional methods is not optimal in the Hamming space, which degrades retrieval performance. To tackle this problem, this paper proposes an unsupervised ensemble hashing method that improves ranking accuracy by independently assembling diverse hash tables. We observe that the higher the accuracy and the larger the diversity of the base learners, the more effective the ensemble method is. Based on this principle, two ensemble hashing approaches are proposed that increase diversity by combining bootstrap sampling with data-dependent methods. In particular, the results are better when the minimum Hamming distance is large and the variance of the Hamming distance is small. Experiments show that the proposed method achieves a performance improvement of about 10%-25% over the baseline algorithm and is competitive with state-of-the-art methods on the CIFAR-10 and LabelMe benchmarks.
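
The ensemble idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's method: the base learner here is an assumed stand-in (random projections thresholded at the mean of each bootstrap sample), and every function name and parameter is hypothetical. The sketch trains several hash tables on bootstrap samples, concatenates their codes, and reports the two quantities the abstract highlights, the minimum and the variance of the pairwise Hamming distances.

```python
import random
from itertools import combinations
from statistics import pvariance


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(u * v for u, v in zip(a, b))


def train_base_hasher(sample, n_bits, rng):
    """Assumed stand-in base learner (not the paper's): random projections,
    each thresholded at its mean over the bootstrap sample."""
    dim = len(sample[0])
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    thresholds = [sum(dot(p, x) for x in sample) / len(sample) for p in planes]
    return planes, thresholds


def encode(hasher, x):
    """Binary code of x under one hash table."""
    planes, thresholds = hasher
    return [1 if dot(p, x) > t else 0 for p, t in zip(planes, thresholds)]


def train_ensemble(data, n_tables, n_bits, seed=0):
    """Train n_tables base hashers, each on its own bootstrap sample."""
    rng = random.Random(seed)
    hashers = []
    for _ in range(n_tables):
        # Bootstrap: draw len(data) points with replacement.
        sample = [rng.choice(data) for _ in data]
        hashers.append(train_base_hasher(sample, n_bits, rng))
    return hashers


def ensemble_encode(hashers, x):
    """Concatenate the codes produced by every table."""
    code = []
    for h in hashers:
        code.extend(encode(h, x))
    return code


def hamming(a, b):
    """Number of positions where two codes differ."""
    return sum(u != v for u, v in zip(a, b))


def distance_stats(codes):
    """Minimum and population variance of all pairwise Hamming distances."""
    dists = [hamming(a, b) for a, b in combinations(codes, 2)]
    return min(dists), pvariance(dists)


# Usage on synthetic data (hypothetical sizes).
data_rng = random.Random(1)
data = [[data_rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(50)]
hashers = train_ensemble(data, n_tables=4, n_bits=8)
codes = [ensemble_encode(hashers, x) for x in data]
min_dist, var_dist = distance_stats(codes)
print(min_dist, var_dist)
```

Under the abstract's claim, a configuration with a larger `min_dist` and a smaller `var_dist` would be expected to rank neighbors better; the sketch only computes these statistics and does not evaluate retrieval quality.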

Original language	English
Article number	9007381
Pages (from-to)	42937-42947
Number of pages	11
Journal	IEEE Access
Volume	8
DOIs	https://doi.org/10.1109/ACCESS.2020.2975883
State	Published - 2020

Keywords

Accuracy and diversity
Distance variance
Ensemble method
Hamming distance
Unsupervised hashing

Cite this

@article{5b363f300d8048b49cc5f6cf70391323,

title = "Unsupervised Ensemble Hashing: Boosting Minimum Hamming Distance",

abstract = "Hashing aims to learn discriminative binary codes for high-dimensional data to support approximate nearest neighbor search. However, the distance ranking obtained by traditional methods is not optimal in the Hamming space, which degrades retrieval performance. To tackle this problem, this paper proposes an unsupervised ensemble hashing method that improves ranking accuracy by independently assembling diverse hash tables. We observe that the higher the accuracy and the larger the diversity of the base learners, the more effective the ensemble method is. Based on this principle, two ensemble hashing approaches are proposed that increase diversity by combining bootstrap sampling with data-dependent methods. In particular, the results are better when the minimum Hamming distance is large and the variance of the Hamming distance is small. Experiments show that the proposed method achieves a performance improvement of about 10%-25% over the baseline algorithm and is competitive with state-of-the-art methods on the CIFAR-10 and LabelMe benchmarks.",

keywords = "Accuracy and diversity, Distance variance, Ensemble method, Hamming distance, Unsupervised hashing",

author = "Yufei Zha and Zhuling Qiu and Peng Zhang and Wei Huang",

note = "Publisher Copyright: {\textcopyright} 2013 IEEE.",

year = "2020",

doi = "10.1109/ACCESS.2020.2975883",

language = "English",

volume = "8",

pages = "42937--42947",

journal = "IEEE Access",

issn = "2169-3536",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Unsupervised Ensemble Hashing

T2 - Boosting Minimum Hamming Distance

AU - Zha, Yufei

AU - Qiu, Zhuling

AU - Zhang, Peng

AU - Huang, Wei

PY - 2020

Y1 - 2020

N2 - Hashing aims to learn discriminative binary codes for high-dimensional data to support approximate nearest neighbor search. However, the distance ranking obtained by traditional methods is not optimal in the Hamming space, which degrades retrieval performance. To tackle this problem, this paper proposes an unsupervised ensemble hashing method that improves ranking accuracy by independently assembling diverse hash tables. We observe that the higher the accuracy and the larger the diversity of the base learners, the more effective the ensemble method is. Based on this principle, two ensemble hashing approaches are proposed that increase diversity by combining bootstrap sampling with data-dependent methods. In particular, the results are better when the minimum Hamming distance is large and the variance of the Hamming distance is small. Experiments show that the proposed method achieves a performance improvement of about 10%-25% over the baseline algorithm and is competitive with state-of-the-art methods on the CIFAR-10 and LabelMe benchmarks.

AB - Hashing aims to learn discriminative binary codes for high-dimensional data to support approximate nearest neighbor search. However, the distance ranking obtained by traditional methods is not optimal in the Hamming space, which degrades retrieval performance. To tackle this problem, this paper proposes an unsupervised ensemble hashing method that improves ranking accuracy by independently assembling diverse hash tables. We observe that the higher the accuracy and the larger the diversity of the base learners, the more effective the ensemble method is. Based on this principle, two ensemble hashing approaches are proposed that increase diversity by combining bootstrap sampling with data-dependent methods. In particular, the results are better when the minimum Hamming distance is large and the variance of the Hamming distance is small. Experiments show that the proposed method achieves a performance improvement of about 10%-25% over the baseline algorithm and is competitive with state-of-the-art methods on the CIFAR-10 and LabelMe benchmarks.

KW - Accuracy and diversity

KW - Distance variance

KW - Ensemble method

KW - Hamming distance

KW - Unsupervised hashing

UR - http://www.scopus.com/inward/record.url?scp=85082056069&partnerID=8YFLogxK

U2 - 10.1109/ACCESS.2020.2975883

DO - 10.1109/ACCESS.2020.2975883

M3 - Article

AN - SCOPUS:85082056069

SN - 2169-3536

VL - 8

SP - 42937

EP - 42947

JO - IEEE Access

JF - IEEE Access

M1 - 9007381

ER -
