TY - JOUR
T1 - Fast adaptive neighbors clustering via embedded clustering
AU - Liu, Yijun
AU - Cai, Yongda
AU - Yang, Xiaojun
AU - Nie, Feiping
AU - Ye, Wujian
N1 - Publisher Copyright: © 2020 Elsevier B.V.
PY - 2020/7/25
Y1 - 2020/7/25
N2 - Recently, spectral clustering (SC) has been gaining increasing attention due to its excellent performance in unsupervised learning. However, the computational complexity of SC is high. Also, the adjacency graph matrix of SC is often constructed with the Gaussian kernel, so the clustering result is sensitive to the kernel parameter σ. Since most large-scale datasets are high-dimensional and sparse, it is a great challenge to apply SC to these data. Therefore, a fast adaptive neighbor clustering method based on embedded clustering (FANCEC) is proposed. First, m anchors are selected from the raw data. Next, a bipartite graph matrix Z connecting the raw data and the anchors is constructed in a parameter-free manner. Then, the graph embedded data are obtained from the raw data by singular value decomposition (SVD). The graph embedded data extract and combine valid information from the raw data while discarding redundant information. After that, m anchors are selected from the graph embedded data, and the adjacency matrix S is initialized. Finally, the adaptive neighbor strategy is used to update the matrix S until the objective function converges. The clustering result of FANCEC can be obtained directly without the post-processing that is required in the k-means method. The experimental results show that the proposed FANCEC can reduce running time on large-scale data and obtain good overall clustering performance compared with traditional SC methods.
AB - Recently, spectral clustering (SC) has been gaining increasing attention due to its excellent performance in unsupervised learning. However, the computational complexity of SC is high. Also, the adjacency graph matrix of SC is often constructed with the Gaussian kernel, so the clustering result is sensitive to the kernel parameter σ. Since most large-scale datasets are high-dimensional and sparse, it is a great challenge to apply SC to these data. Therefore, a fast adaptive neighbor clustering method based on embedded clustering (FANCEC) is proposed. First, m anchors are selected from the raw data. Next, a bipartite graph matrix Z connecting the raw data and the anchors is constructed in a parameter-free manner. Then, the graph embedded data are obtained from the raw data by singular value decomposition (SVD). The graph embedded data extract and combine valid information from the raw data while discarding redundant information. After that, m anchors are selected from the graph embedded data, and the adjacency matrix S is initialized. Finally, the adaptive neighbor strategy is used to update the matrix S until the objective function converges. The clustering result of FANCEC can be obtained directly without the post-processing that is required in the k-means method. The experimental results show that the proposed FANCEC can reduce running time on large-scale data and obtain good overall clustering performance compared with traditional SC methods.
KW - Adaptive neighbor method
KW - Anchor-based graph embedded
KW - Fast clustering
KW - Large-scale data
UR - http://www.scopus.com/inward/record.url?scp=85081914207&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2020.02.087
DO - 10.1016/j.neucom.2020.02.087
M3 - Article
AN - SCOPUS:85081914207
SN - 0925-2312
VL - 399
SP - 331
EP - 341
JO - Neurocomputing
JF - Neurocomputing
ER -