TY - JOUR
T1 - Fast Self-Supervised Clustering With Anchor Graph
AU - Wang, Jingyu
AU - Ma, Zhenyu
AU - Nie, Feiping
AU - Li, Xuelong
N1 - Publisher Copyright:
© 2012 IEEE.
PY - 2022/9/1
Y1 - 2022/9/1
N2 - Benefit from avoiding the utilization of labeled samples, which are usually insufficient in the real world, unsupervised learning has been regarded as a speedy and powerful strategy on clustering tasks. However, clustering directly from primal data sets leads to high computational cost, which limits its application on large-scale and high-dimensional problems. Recently, anchor-based theories are proposed to partly mitigate this problem and field naturally sparse affinity matrix, while it is still a challenge to get excellent performance along with high efficiency. To dispose of this issue, we first presented a fast semisupervised framework (FSSF) combined with a balanced $K$ -means-based hierarchical $K$ -means (BKHK) method and the bipartite graph theory. Thereafter, we proposed a fast self-supervised clustering method involved in this crucial semisupervised framework, in which all labels are inferred from a constructed bipartite graph with exactly $k$ connected components. The proposed method remarkably accelerates the general semisupervised learning through the anchor and consists of four significant parts: 1) obtaining the anchor set as interim through BKHK algorithm; 2) constructing the bipartite graph; 3) solving the self-supervised problem to construct a typical probability model with FSSF; and 4) selecting the most representative points regarding anchors from BKHK as an interim and conducting label propagation. The experimental results on toy examples and benchmark data sets have demonstrated that the proposed method outperforms other approaches.
AB - Benefit from avoiding the utilization of labeled samples, which are usually insufficient in the real world, unsupervised learning has been regarded as a speedy and powerful strategy on clustering tasks. However, clustering directly from primal data sets leads to high computational cost, which limits its application on large-scale and high-dimensional problems. Recently, anchor-based theories are proposed to partly mitigate this problem and field naturally sparse affinity matrix, while it is still a challenge to get excellent performance along with high efficiency. To dispose of this issue, we first presented a fast semisupervised framework (FSSF) combined with a balanced $K$ -means-based hierarchical $K$ -means (BKHK) method and the bipartite graph theory. Thereafter, we proposed a fast self-supervised clustering method involved in this crucial semisupervised framework, in which all labels are inferred from a constructed bipartite graph with exactly $k$ connected components. The proposed method remarkably accelerates the general semisupervised learning through the anchor and consists of four significant parts: 1) obtaining the anchor set as interim through BKHK algorithm; 2) constructing the bipartite graph; 3) solving the self-supervised problem to construct a typical probability model with FSSF; and 4) selecting the most representative points regarding anchors from BKHK as an interim and conducting label propagation. The experimental results on toy examples and benchmark data sets have demonstrated that the proposed method outperforms other approaches.
KW - Bipartite graph
KW - label propagation
KW - self-supervised learning
KW - semisupervised framework
KW - special selection
UR - http://www.scopus.com/inward/record.url?scp=85100919990&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2021.3056080
DO - 10.1109/TNNLS.2021.3056080
M3 - 文章
C2 - 33587715
AN - SCOPUS:85100919990
SN - 2162-237X
VL - 33
SP - 4199
EP - 4212
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 9
ER -