Multi-view K-means clustering on big data

Xiao Cai, Feiping Nie, Heng Huang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

625 Scopus citations

Abstract

In the past decade, more and more data have been collected from multiple sources or represented by multiple views, where different views describe distinct perspectives of the data. Although each view could be used individually for finding patterns by clustering, the clustering performance could be made more accurate by exploring the rich information among multiple views. Several multi-view clustering methods have been proposed to integrate different views of data in an unsupervised manner. However, they are graph-based approaches, e.g. based on spectral clustering, and so cannot handle large-scale data. How to combine these heterogeneous features for unsupervised large-scale data clustering has become a challenging problem. In this paper, we propose a new robust large-scale multi-view clustering method to integrate heterogeneous representations of large-scale data. We evaluate the proposed methods on six benchmark data sets and compare their performance with several commonly used clustering approaches as well as baseline multi-view clustering methods. In all experimental results, our proposed methods consistently achieve superior clustering performance.

Original language: English
Title of host publication: IJCAI 2013 - Proceedings of the 23rd International Joint Conference on Artificial Intelligence
Pages: 2598-2604
Number of pages: 7
State: Published - 2013
Externally published: Yes
Event: 23rd International Joint Conference on Artificial Intelligence, IJCAI 2013 - Beijing, China
Duration: 3 Aug 2013 → 9 Aug 2013

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
ISSN (Print): 1045-0823

Conference

Conference: 23rd International Joint Conference on Artificial Intelligence, IJCAI 2013
Country/Territory: China
City: Beijing
Period: 3/08/13 → 9/08/13
