TY - JOUR
T1 - Comprehensive Information Extraction with Separable Representation Learning for Multi-View Clustering
AU - Wang, Penglei
AU - Wu, Danyang
AU - Xu, Jin
AU - Nie, Feiping
N1 - Publisher Copyright: © 1991-2012 IEEE.
PY - 2025
Y1 - 2025
AB - Deep Multi-View Clustering (MVC) methods partition multi-view data into disjoint clusters in an unsupervised manner, showing significant promise across various domains. However, current MVC methods primarily focus on capturing the consistency information shared across all views and undervalue the specificity information inherent in each view that reflects its unique characteristics. Furthermore, the underexploration of the separability of learned representations limits the overall clustering performance of existing MVC methods and leads to undesirable clustering results. In this paper, we propose a fully differentiable and end-to-end deep MVC framework, named Comprehensive Information Extraction with Separable Representation Learning (CIRSEL), to address these issues. CIRSEL recasts specificity information extraction as a high-order graph pooling process to capture the view-specific characteristics of individual views. Utilizing the cross-attention mechanism, CIRSEL adaptively fuses the consistent and view-specific representations to achieve comprehensive information extraction. Subsequently, CIRSEL maps representations into a unit hypersphere space with evenly distributed prototypes and maximizes the variational estimation of Mutual Information, which enhances the inter-cluster separability and intra-cluster compactness in the embedding space and further benefits the following clustering learning. Finally, CIRSEL introduces a nuclear norm-based balance regularization, which ensures balanced clustering results can be directly retrieved by the cosine similarity between the representations and prototypes. Extensive experiments on ten benchmark datasets demonstrate the effectiveness of CIRSEL compared to sixteen current MVC methods.
KW - Deep Clustering
KW - Multi-View Clustering
KW - Multi-View Learning
KW - Representation Learning
UR - http://www.scopus.com/inward/record.url?scp=105005841676&partnerID=8YFLogxK
U2 - 10.1109/TCSVT.2025.3571787
DO - 10.1109/TCSVT.2025.3571787
M3 - Article
AN - SCOPUS:105005841676
SN - 1051-8215
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
ER -