Structured doubly stochastic matrix for graph based clustering

Xiaoqian Wang, Feiping Nie, Heng Huang

科研成果: 书/报告/会议事项章节会议稿件同行评审

45 引用 (Scopus)

摘要

As one of the most significant machine learning topics, clustering has been extensively employed in various kinds of area. Its prevalent application in scientific research as well as industrial practice has drawn high attention in this day and age. A multitude of clustering methods have been developed, among which the graph based clustering method using the affinity matrix has been laid great emphasis on. Recent research work used the doubly stochastic matrix to normalize the input affinity matrix and enhance the graph based clustering models. Although the doubly stochastic matrix can improve the clustering performance, the clustering structure in the doubly stochastic matrix is not clear as expected. Thus, postprocessing step is required to extract the final clustering results, which may not be optimal. To address this problem, in this paper, we propose a novel convex model to learn the structured doubly stochastic matrix by imposing low-rank constraint on the graph Laplacian matrix. Our new structured doubly stochastic matrix can explicitly uncover the clustering structure and encode the probabilities of pair-wise data points to be connected, such that the clustering results are enhanced. An efficient optimization algorithm is derived to solve our new objective. Also, we provide theoretical discussions that when the input differs, our method possesses interesting connections with K-means and spectral graph cut models respectively. We conduct experiments on both synthetic and benchmark datasets to validate the performance of our proposed method. The empirical results demonstrate that our model provides an approach to better solving the K-mean clustering problem. By using the cluster indicator provided by our model as initialization, Kmeans converges to a smaller objective function value with better clustering performance. Moreover, we compare the clustering performance of our model with spectral clustering and related double stochastic model. On all datasets, our method performs equally or better than the related methods.

源语言英语
主期刊名KDD 2016 - Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
出版商Association for Computing Machinery
1245-1254
页数10
ISBN(电子版)9781450342322
DOI
出版状态已出版 - 13 8月 2016
已对外发布
活动22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016 - San Francisco, 美国
期限: 13 8月 201617 8月 2016

出版系列

姓名Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
13-17-August-2016

会议

会议22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016
国家/地区美国
San Francisco
时期13/08/1617/08/16

指纹

探究 'Structured doubly stochastic matrix for graph based clustering' 的科研主题。它们共同构成独一无二的指纹。

引用此