An efficient semi-supervised balanced cut with hard pairwise constraints and partial labels

Weizhong Yu, Liyin Xing, Feiping Nie, Xuelong Li

Research output: Contribution to journalArticlepeer-review

3 Scopus citations

Abstract

With a small scale of prior information, including pairwise constraints and partial labels, semi-supervised clustering algorithms can improve clustering performance by imposing certain restrictions on the clustering results, and have accordingly attracted a lot of attention in recent years. Moreover, a large number of graph-based semi-supervised clustering algorithms have been proposed, since graphs are naturally suited to pairwise constraints. However, most graph-based approaches involve a two-step process and are not scalable to large-scale datasets due to their high time complexity. Additionally, they also encounter a serious problem of constraint violation. To deal with this issue, we propose a new non-parametric model to directly obtain the indicator matrix by minimizing the cut of the graph with a regularization term that corresponds to the balance of the clusters. To simultaneously support the two forms of prior information, our approach restricts the indicator matrix during optimization, while also ensuring that the clusters obtained by our approach do not violate any constraints. The computational complexity of the proposed approach is nearly linear to the data size n. Experiments show that the algorithm proposed in this paper achieves significant improvements in both clustering performance and efficiency.

Original languageEnglish
Article number110747
JournalKnowledge-Based Systems
Volume276
DOIs
StatePublished - 27 Sep 2023

Keywords

  • Graph-based clustering
  • Pairwise constraints
  • Semi-supervised clustering

Fingerprint

Dive into the research topics of 'An efficient semi-supervised balanced cut with hard pairwise constraints and partial labels'. Together they form a unique fingerprint.

Cite this