Discrete and Balanced Spectral Clustering With Scalability

Research output: Contribution to journal › Article › peer-review

27 Scopus citations

Abstract

Spectral Clustering (SC) has been the subject of intensive research due to its remarkable clustering performance. Despite its successes, most existing SC methods suffer from several critical issues. First, they typically involve two independent stages, i.e., learning the continuous relaxation matrix followed by the discretization of the cluster indicator matrix. This two-stage approach can result in suboptimal solutions that negatively impact the clustering performance. Second, these methods struggle to maintain the balance property of clusters inherent in many real-world data, which restricts their practical applicability. Finally, these methods are computationally expensive and hence unable to handle large-scale datasets. In light of these limitations, we present a novel Discrete and Balanced Spectral Clustering with Scalability (DBSC) model that integrates the learning of the continuous relaxation matrix and the discrete cluster indicator matrix into a single step. Moreover, the proposed model keeps the size of each cluster approximately equal, thereby achieving soft-balanced clustering. Furthermore, the DBSC model incorporates an anchor-based strategy to improve its scalability to large-scale datasets. The experimental results demonstrate that our proposed model outperforms existing methods in terms of both clustering performance and balance performance. Specifically, the clustering accuracy of DBSC on the CMUPIE data is 17.93% higher than that of the state-of-the-art (SOTA) methods (LABIN, EBSC, etc.).
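
For orientation, the conventional two-stage pipeline that the abstract criticizes, combined with an anchor graph for scalability, might look roughly like the sketch below. It is not the DBSC model and includes no balance regularization. The function names, the Gaussian kernel-width heuristic, the choice of anchors, and the farthest-point k-means initialization are all assumptions made for illustration.

```python
import numpy as np


def anchor_graph(X, anchors, k=3):
    """Row-stochastic similarity Z (n x m) linking each point to its k nearest anchors."""
    # Squared Euclidean distance between every point and every anchor.
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    rows = np.arange(X.shape[0])[:, None]
    idx = np.argsort(d2, axis=1)[:, :k]          # indices of the k nearest anchors
    sigma = d2[rows, idx].mean() + 1e-12         # kernel width heuristic (assumption)
    Z = np.zeros_like(d2)
    Z[rows, idx] = np.exp(-d2[rows, idx] / sigma)
    return Z / Z.sum(axis=1, keepdims=True)


def spectral_embedding(Z, c):
    """Stage 1: continuous relaxation from the anchor graph.

    The leading eigenvectors of W = Z diag(Z^T 1)^{-1} Z^T are the leading
    left singular vectors of B = Z diag(Z^T 1)^{-1/2}, which is only n x m.
    """
    col = np.maximum(Z.sum(axis=0), 1e-12)
    B = Z / np.sqrt(col)
    U, _, _ = np.linalg.svd(B, full_matrices=False)
    return U[:, :c]


def kmeans(F, c, n_iter=50):
    """Stage 2: discretization by plain k-means with farthest-point initialization."""
    centers = [F[0]]
    for _ in range(1, c):
        d = np.min([((F - ctr) ** 2).sum(axis=1) for ctr in centers], axis=0)
        centers.append(F[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = ((F[:, None] - centers[None]) ** 2).sum(axis=-1).argmin(axis=1)
        for j in range(c):
            if np.any(labels == j):
                centers[j] = F[labels == j].mean(axis=0)
    return labels


def two_stage_anchor_sc(X, anchors, c, k=3):
    """Two independent stages: relaxed embedding first, discrete labels second."""
    return kmeans(spectral_embedding(anchor_graph(X, anchors, k), c), c)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(10.0, 0.5, (50, 2))])
    labels = two_stage_anchor_sc(X, X[::10], c=2)
    print("cluster sizes:", np.bincount(labels))
```

Because the embedding is computed from the n x m matrix rather than an n x n affinity matrix, the cost grows linearly with the number of points for a fixed number of anchors. Nothing in the second stage constrains cluster sizes, and the labels are found independently of the embedding, which are the two gaps the abstract says DBSC addresses.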

Original language: English
Pages (from-to): 14321-14336
Number of pages: 16
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 45
Issue number: 12
State: Published - 1 Dec 2023

Keywords

  • Anchor graph
  • balance regularization
  • spectral clustering
  • spectral rotation
