TY - JOUR
T1 - Large-Scale Clustering With Structured Optimal Bipartite Graph
AU - Zhang, Han
AU - Nie, Feiping
AU - Li, Xuelong
N1 - Publisher Copyright: © 1979-2012 IEEE.
PY - 2023/8/1
Y1 - 2023/8/1
N2 - The rapid growth of data size makes large-scale data clustering necessary. To this end, bipartite graph theory is frequently applied to design scalable algorithms that depict the relations between samples and a few anchors instead of linking pairwise samples. However, bipartite graphs and existing spectral embedding methods ignore explicit cluster structure learning, so they have to obtain cluster labels through post-processing such as K-Means. Moreover, existing anchor-based approaches always acquire anchors from K-Means centroids or a few random samples, both of which save time but yield unstable performance. In this paper, we investigate scalability, stability, and integration in large-scale graph clustering. We propose a cluster-structured graph learning model that yields a c-connected bipartite graph (c is the cluster number) and produces discrete labels directly. Taking data features or pairwise relations as the starting point, we further design an initialization-independent anchor selection strategy. Experimental results on synthetic and real-world datasets demonstrate that the proposed method outperforms its peers.
AB - The rapid growth of data size makes large-scale data clustering necessary. To this end, bipartite graph theory is frequently applied to design scalable algorithms that depict the relations between samples and a few anchors instead of linking pairwise samples. However, bipartite graphs and existing spectral embedding methods ignore explicit cluster structure learning, so they have to obtain cluster labels through post-processing such as K-Means. Moreover, existing anchor-based approaches always acquire anchors from K-Means centroids or a few random samples, both of which save time but yield unstable performance. In this paper, we investigate scalability, stability, and integration in large-scale graph clustering. We propose a cluster-structured graph learning model that yields a c-connected bipartite graph (c is the cluster number) and produces discrete labels directly. Taking data features or pairwise relations as the starting point, we further design an initialization-independent anchor selection strategy. Experimental results on synthetic and real-world datasets demonstrate that the proposed method outperforms its peers.
KW - Anchor selection
KW - bipartite graph
KW - discrete labels
KW - large-scale clustering
KW - pairwise relation
UR - http://www.scopus.com/inward/record.url?scp=85160232256&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2023.3277532
DO - 10.1109/TPAMI.2023.3277532
M3 - Article
C2 - 37200121
AN - SCOPUS:85160232256
SN - 0162-8828
VL - 45
SP - 9950
EP - 9963
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 8
ER -