Efficient greedy optimization method for k-means

Research output: Contribution to journalArticlepeer-review

1 Scopus citations

Abstract

k-means is a widely used and classical clustering algorithm, with numerous studies dedicated to enhancing its performance. In this paper, we first review previous approaches to improving k-means, and highlight the significance of greedy strategies. Inspired by this, we propose a novel greedy-based optimization algorithm for k-means, referred to as greedy local k-means (GLKM). GLKM initially treats each sample as an individual cluster and employs a greedy merging strategy to iteratively merge pairs of clusters, based on minimizing incremental k-means loss. The k-nearest neighbors graph is utilized to focus on local structure and improve efficiency. A red-black tree is employed to directly retrieve merged cluster pairs. GLKM yields deterministic clustering results and does not require random initialization or multiple runs to obtain optimal clustering results. In addition, it avoids producing empty clusters. Experiments on synthetic and benchmark datasets demonstrate the effectiveness and superiority of GLKM. Its ability to handle various data types and scales makes it a reliable choice for a wide range of real-world scenarios.

Original languageEnglish
Article number112140
JournalPattern Recognition
Volume171
DOIs
StatePublished - Mar 2026

Keywords

  • Clustering
  • Greedy heuristic
  • K-means
  • K-nearest neighbors graph
  • Red-black tree

Fingerprint

Dive into the research topics of 'Efficient greedy optimization method for k-means'. Together they form a unique fingerprint.

Cite this