Learning a Mahalanobis distance metric for data clustering and classification

Shiming Xiang, Feiping Nie, Changshui Zhang

Research output: Contribution to journal › Article › peer-review

541 Scopus citations

Abstract

The distance metric is a key issue in many machine learning algorithms. This paper considers the general problem of learning from pairwise constraints in the form of must-links and cannot-links. As one kind of side information, a must-link indicates that two data points must be in the same class, while a cannot-link indicates that two data points must be in different classes. Given must-link and cannot-link information, our goal is to learn a Mahalanobis distance metric under which the distances between point pairs in must-links are as small as possible and those between point pairs in cannot-links are as large as possible. This task is formulated as a constrained optimization problem whose global optimum can be obtained effectively and efficiently. Finally, applications to data clustering, interactive natural image segmentation, and face pose estimation are presented. Experimental results illustrate the effectiveness of our algorithm.
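To make the quantities in the abstract concrete, the following minimal sketch evaluates a squared Mahalanobis distance and sums it over must-link and cannot-link pairs. It is an illustration only, not the authors' optimization procedure: the function names, the toy data, and the hand-picked matrix `A_hand` are all assumptions introduced here, and no metric is actually learned.

```python
def mahalanobis_sq(x, y, A):
    """Squared Mahalanobis distance (x - y)^T A (x - y).

    A is assumed to be a symmetric positive semidefinite matrix,
    given as a list of rows.
    """
    d = [xi - yi for xi, yi in zip(x, y)]
    Ad = [sum(a * dj for a, dj in zip(row, d)) for row in A]
    return sum(di * adi for di, adi in zip(d, Ad))


def pair_distance_sums(X, must_links, cannot_links, A):
    """Return (sum over must-link pairs, sum over cannot-link pairs)."""
    must = sum(mahalanobis_sq(X[i], X[j], A) for i, j in must_links)
    cannot = sum(mahalanobis_sq(X[i], X[j], A) for i, j in cannot_links)
    return must, cannot


# Hypothetical toy data: two classes separated along the first feature,
# with within-class variation only along the second feature.
X = [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]]
must_links = [(0, 1), (2, 3)]      # pairs assumed to share a class
cannot_links = [(0, 2), (1, 3)]    # pairs assumed to be in different classes

A_euclid = [[1.0, 0.0], [0.0, 1.0]]  # identity: plain squared Euclidean distance
A_hand = [[1.0, 0.0], [0.0, 0.0]]    # hand-picked: ignores the second feature

print(pair_distance_sums(X, must_links, cannot_links, A_euclid))
print(pair_distance_sums(X, must_links, cannot_links, A_hand))
```

On this toy data the hand-picked matrix shrinks the must-link sum while leaving the cannot-link sum unchanged, which is the qualitative behaviour the abstract asks of a learned metric.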

Original language: English
Pages (from-to): 3600-3612
Number of pages: 13
Journal: Pattern Recognition
Volume: 41
Issue number: 12
State: Published - Dec 2008
Externally published: Yes

Keywords

  • Data clustering
  • Distance metric learning
  • Face pose estimation
  • Global optimization
  • Interactive image segmentation
  • Mahalanobis distance
