Abstract
k-means is a widely used and classical clustering algorithm, with numerous studies dedicated to enhancing its performance. In this paper, we first review previous approaches to improving k-means, and highlight the significance of greedy strategies. Inspired by this, we propose a novel greedy-based optimization algorithm for k-means, referred to as greedy local k-means (GLKM). GLKM initially treats each sample as an individual cluster and employs a greedy merging strategy to iteratively merge pairs of clusters, based on minimizing incremental k-means loss. The k-nearest neighbors graph is utilized to focus on local structure and improve efficiency. A red-black tree is employed to directly retrieve merged cluster pairs. GLKM yields deterministic clustering results and does not require random initialization or multiple runs to obtain optimal clustering results. In addition, it avoids producing empty clusters. Experiments on synthetic and benchmark datasets demonstrate the effectiveness and superiority of GLKM. Its ability to handle various data types and scales makes it a reliable choice for a wide range of real-world scenarios.
| Original language | English |
|---|---|
| Article number | 112140 |
| Journal | Pattern Recognition |
| Volume | 171 |
| DOIs | |
| State | Published - Mar 2026 |
Keywords
- Clustering
- Greedy heuristic
- K-means
- K-nearest neighbors graph
- Red-black tree
Fingerprint
Dive into the research topics of 'Efficient greedy optimization method for k-means'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver