Nonnegative Matrix Tri-Factorization based high-order co-clustering and its fast implementation

Hua Wang, Feiping Nie, Heng Huang, Chris Ding

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

52 citations (Scopus)

Abstract

The rapid growth of the Internet and modern technologies has produced data involving objects of multiple types that are related to each other, called Multi-Type Relational data. Traditional clustering methods for single-type data rarely work well on such data, which calls for new clustering techniques, called high-order co-clustering (HOCC), that handle the multiple types of data at the same time. A major challenge in developing HOCC methods is how to make effective use of all the information available in a multi-type relational data set, including both inter-type and intra-type relationships. Meanwhile, because many real-world data sets are large, clustering methods with computationally efficient solution algorithms are of great practical interest. In this paper, we first present a general HOCC framework, named Orthogonal Nonnegative Matrix Tri-Factorization (O-NMTF), for simultaneous clustering of multi-type relational data. The proposed O-NMTF approach employs Nonnegative Matrix Tri-Factorization (NMTF) to simultaneously cluster different types of data using the inter-type relationships, and incorporates intra-type information through manifold regularization; unlike existing works, we emphasize the importance of the orthogonality of the NMTF factor matrices. Based on O-NMTF, we further develop a novel Fast Nonnegative Matrix Tri-Factorization (F-NMTF) approach to deal with large-scale data. Instead of constraining the factor matrices of NMTF to be nonnegative as in existing methods, F-NMTF constrains them to be cluster indicator matrices, a special type of nonnegative matrix. As a result, the optimization problem of the proposed method can be decoupled into subproblems of much smaller size that require far fewer matrix multiplications, so that our new algorithm scales well to large real-world data. Extensive experimental evaluations have demonstrated the effectiveness of our new approaches.
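The decoupling described in the abstract can be illustrated with a small NumPy sketch for the two-type case, where a relation matrix X is approximated as F S Gᵀ with F and G restricted to cluster indicator matrices. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the alternating update order, the closed-form least-squares step for S, and the nearest-prototype label updates are all choices made here, since the abstract does not spell out the update rules. Manifold regularization and more than two data types are left out.

```python
import numpy as np

def indicator(labels, k):
    """Build an n x k 0/1 matrix with exactly one 1 per row (hypothetical helper)."""
    M = np.zeros((labels.size, k))
    M[np.arange(labels.size), labels] = 1.0
    return M

def tri_factorize_indicator(X, k_rows, k_cols, n_iter=20, seed=0):
    """Alternating minimization of ||X - F S G^T||_F^2 with indicator F and G.

    Assumed update scheme for illustration only; not taken from the paper.
    Returns row labels, column labels, and the squared error per iteration.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    row_labels = rng.integers(k_rows, size=n)
    col_labels = rng.integers(k_cols, size=m)
    errors = []
    for _ in range(n_iter):
        F = indicator(row_labels, k_rows)
        G = indicator(col_labels, k_cols)
        # Least-squares S for fixed F and G; pinv tolerates empty clusters.
        S = np.linalg.pinv(F) @ X @ np.linalg.pinv(G).T
        errors.append(float(np.linalg.norm(X - F @ S @ G.T) ** 2))
        # Each column of X independently picks the closest column of F @ S.
        FS = F @ S                                   # shape (n, k_cols)
        d_cols = ((X[:, :, None] - FS[:, None, :]) ** 2).sum(axis=0)
        col_labels = d_cols.argmin(axis=1)
        G = indicator(col_labels, k_cols)
        # Each row of X independently picks the closest row of S @ G.T.
        SG = S @ G.T                                 # shape (k_rows, m)
        d_rows = ((X[:, None, :] - SG[None, :, :]) ** 2).sum(axis=2)
        row_labels = d_rows.argmin(axis=1)
    return row_labels, col_labels, errors

if __name__ == "__main__":
    # Toy relation matrix with two row blocks and two column blocks.
    X = np.kron(np.array([[5.0, 0.0], [0.0, 5.0]]), np.ones((3, 4)))
    rows, cols, errs = tri_factorize_indicator(X, 2, 2, n_iter=10)
    print(rows, cols, errs[-1])
```

Because each row and each column chooses its cluster independently, the label updates reduce to nearest-prototype assignments rather than full multiplicative updates over the factor matrices, which is the kind of smaller subproblem the abstract alludes to. In this sketch every step is an exact minimization over one block of variables, so the recorded error cannot increase from one iteration to the next, although the result may still be a local optimum that depends on the random initialization.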

Original language: English
Title of host publication: Proceedings - 11th IEEE International Conference on Data Mining, ICDM 2011
Pages: 774-783
Number of pages: 10
DOI
Publication status: Published - 2011
Externally published
Event: 11th IEEE International Conference on Data Mining, ICDM 2011 - Vancouver, BC, Canada
Duration: 11 Dec 2011 to 14 Dec 2011

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Conference

Conference: 11th IEEE International Conference on Data Mining, ICDM 2011
Country/Territory: Canada
City: Vancouver, BC
Period: 11/12/11 to 14/12/11
