Large-scale spectral clustering based on representative points

Libo Yang, Xuemei Liu, Feiping Nie, Mingtang Liu

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Spectral clustering (SC) has attracted increasing attention due to its effectiveness in machine learning. However, most traditional spectral clustering methods are difficult to apply to large-scale problems, mainly because of their high computational complexity of O(n^3), where n is the number of samples. To achieve fast spectral clustering, we propose a novel approach, called representative point-based spectral clustering (RPSC), that deals efficiently with the large-scale spectral clustering problem. The proposed method first generates two layers of representative points successively by BKHK (balanced k-means-based hierarchical k-means). It then constructs a hierarchical bipartite graph and performs spectral analysis on that graph. Specifically, we construct the similarity matrix using a parameter-free neighbor assignment method, which avoids the need to tune extra parameters. Furthermore, we perform coclustering on the final similarity matrix. The coclustering mechanism exploits the cooccurring cluster structure among the representative points and the original data to strengthen clustering performance. As a result, the computational complexity is significantly reduced and the clustering accuracy is improved. Extensive experiments on several large-scale data sets show the effectiveness, efficiency, and stability of the proposed method.
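
The abstract does not spell out the parameter-free neighbor assignment, so the following is only a minimal sketch under stated assumptions. It assumes the closed-form rule common in the adaptive-neighbors literature: each sample is connected to its k nearest representative points, with weight (d_{k+1} - d_j) / (k d_{k+1} - sum of the k smallest d), where d denotes squared Euclidean distance and d_{k+1} is the distance to the (k+1)-th nearest representative point. The names `sq_dist`, `neighbor_weights`, `k`, and `eps` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def neighbor_weights(
    points: Sequence[Sequence[float]],
    anchors: Sequence[Sequence[float]],
    k: int,
    eps: float = 1e-12,
) -> List[List[float]]:
    """Return an n-by-m weight matrix between samples and representative points.

    Assumed rule (not confirmed by the abstract): for each sample, the k
    nearest anchors get weight (d_next - d_j) / (k * d_next - sum_k d),
    where d_next is the distance to the (k+1)-th nearest anchor.
    """
    m = len(anchors)
    if not 0 < k < m:
        raise ValueError("need 0 < k < number of anchors")
    rows = []
    for p in points:
        d = [sq_dist(p, a) for a in anchors]
        order = sorted(range(m), key=d.__getitem__)  # anchor indices, nearest first
        nearest, d_next = order[:k], d[order[k]]
        denom = k * d_next - sum(d[j] for j in nearest)
        row = [0.0] * m
        for j in nearest:
            # If the k+1 nearest anchors are all equidistant, fall back to uniform weights.
            row[j] = (d_next - d[j]) / denom if denom > eps else 1.0 / k
        rows.append(row)
    return rows


if __name__ == "__main__":
    weights = neighbor_weights([(0.0, 0.0)], [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)], k=2)
    print(weights)
```

Under this rule each row has exactly k nonzero entries and sums to one, and the only quantity to choose is the neighbor count k rather than a kernel bandwidth. The resulting matrix would play the role of the sample-to-representative block of a bipartite graph; the BKHK step and the spectral analysis described in the abstract are not covered by this sketch.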

Original language: English
Article number: 5864020
Journal: Mathematical Problems in Engineering
Volume: 2019
State: Published - 2019
