TY - JOUR
T1 - Fast spectral clustering learning with hierarchical bipartite graph for large-scale data
AU - Yang, Xiaojun
AU - Yu, Weizhong
AU - Wang, Rong
AU - Zhang, Guohao
AU - Nie, Feiping
N1 - Publisher Copyright:
© 2018 Elsevier B.V.
PY - 2020/2
Y1 - 2020/2
N2 - Spectral clustering (SC) is drawing more and more attention due to its effectiveness in unsupervised learning. However, all of these methods still have limitations. First, the method is not suitable for large-scale problems due to its high computational complexity. Second, the neighborhood weighted graph is constructed by the Gaussian kernel, meaning that more work is required to tune the heat-kernel parameter. In order to overcome these issues, we propose a novel spectral clustering based on hierarchical bipartite graph (SCHBG) approach by exploring multiple-layer anchors with a pyramid-style structure. First, the proposed algorithm constructs a hierarchical bipartite graph, and then performs spectral analysis on the graph. As a result, the computational complexity can be largely reduced. Furthermore, we adopt a parameter-free yet effective neighbor assignment strategy to construct the similarity matrix, which avoids the need to tune the heat-kernel parameter. Finally, the algorithm is able to deal with the out-of-sample problem for large-scale data and its computational complexity is significantly reduced. Experiments demonstrate the efficiency and effectiveness of the proposed SCHBG algorithm. Results show that the SCHBG approach can achieve good clustering accuracy (76%) on an 8-million datasets. Furthermore, owing to the use of the bipartite graph, the algorithm can reduce the time cost for out-of-sample situations with almost the same clustering accuracy as for large sizes of data.
AB - Spectral clustering (SC) is drawing more and more attention due to its effectiveness in unsupervised learning. However, all of these methods still have limitations. First, the method is not suitable for large-scale problems due to its high computational complexity. Second, the neighborhood weighted graph is constructed by the Gaussian kernel, meaning that more work is required to tune the heat-kernel parameter. In order to overcome these issues, we propose a novel spectral clustering based on hierarchical bipartite graph (SCHBG) approach by exploring multiple-layer anchors with a pyramid-style structure. First, the proposed algorithm constructs a hierarchical bipartite graph, and then performs spectral analysis on the graph. As a result, the computational complexity can be largely reduced. Furthermore, we adopt a parameter-free yet effective neighbor assignment strategy to construct the similarity matrix, which avoids the need to tune the heat-kernel parameter. Finally, the algorithm is able to deal with the out-of-sample problem for large-scale data and its computational complexity is significantly reduced. Experiments demonstrate the efficiency and effectiveness of the proposed SCHBG algorithm. Results show that the SCHBG approach can achieve good clustering accuracy (76%) on an 8-million datasets. Furthermore, owing to the use of the bipartite graph, the algorithm can reduce the time cost for out-of-sample situations with almost the same clustering accuracy as for large sizes of data.
KW - Bipartite graph
KW - Hierarchical graph
KW - Large scale data
KW - Out-of-sample
KW - Spectral clustering
UR - http://www.scopus.com/inward/record.url?scp=85050530100&partnerID=8YFLogxK
U2 - 10.1016/j.patrec.2018.06.024
DO - 10.1016/j.patrec.2018.06.024
M3 - 文章
AN - SCOPUS:85050530100
SN - 0167-8655
VL - 130
SP - 345
EP - 352
JO - Pattern Recognition Letters
JF - Pattern Recognition Letters
ER -