Anchor-graph regularized orthogonal concept factorization for document clustering

Ben Yang, Zhiyuan Xue, Jinghan Wu, Xuetao Zhang, Feiping Nie, Badong Chen

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Concept factorization (CF) has attracted widespread attention for its promising performance in document clustering. Among the various CF variants, graph-regularized CF is the most impressive type, as it improves clustering effectiveness by exploiting structural information. Nevertheless, the clustering efficiency of these graph-regularized methods is restricted by two considerations: (1) introducing a full-sample graph increases computational complexity; (2) most of them require intensive multiplications during optimization, which impairs optimization efficiency. To address these issues, we propose an anchor-graph regularized orthogonal concept factorization (AROCF) method that enhances both clustering efficiency and effectiveness in document clustering tasks. First, AROCF approximates the full-sample graph with a small-scale anchor graph, reducing the complexity of graph construction from quadratic to linear. Second, one of the factor matrices is constrained to be the cluster indicator matrix, which avoids the extra efficiency loss of running K-means after optimization. Finally, an orthogonal constraint restricts the freedom of the factorization to increase clustering effectiveness. To optimize the AROCF model, we develop a fast optimization strategy that combines the trace and orthogonality of matrices. Extensive experiments on various document datasets demonstrate the effectiveness and efficiency of AROCF.
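The abstract does not spell out how the anchor graph or the indicator matrix is built, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes anchors are randomly sampled documents, each sample is linked to its k nearest anchors with Gaussian weights, and rows are normalized to sum to one. The function names `anchor_graph` and `indicator_matrix` and all parameters are hypothetical. The point it illustrates is that only an n × m sample-to-anchor matrix is formed, so the cost grows linearly in the number of samples n for a fixed anchor count m, and that a hard cluster indicator matrix has mutually orthogonal columns.

```python
import numpy as np

def anchor_graph(X, m=4, k=2, seed=0):
    """Sample-to-anchor affinity matrix Z of shape (n, m); illustrative only."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumption: anchors are m randomly chosen samples
    # (k-means centers are another common choice).
    anchors = X[rng.choice(n, size=m, replace=False)]
    # Squared Euclidean distances, shape (n, m): linear in n for fixed m.
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k nearest anchors for every sample.
    nearest = np.argsort(d2, axis=1)[:, :k]
    rows = np.arange(n)[:, None]
    # Assumption: a single global bandwidth taken from the mean kept distance.
    sigma2 = d2[rows, nearest].mean() + 1e-12
    Z = np.zeros((n, m))
    Z[rows, nearest] = np.exp(-d2[rows, nearest] / sigma2)
    # Normalize so each sample's anchor weights sum to one.
    Z /= Z.sum(axis=1, keepdims=True)
    return Z

def indicator_matrix(labels, c):
    """Hard cluster indicator G of shape (n, c): one 1 per row."""
    labels = np.asarray(labels)
    G = np.zeros((labels.size, c))
    G[np.arange(labels.size), labels] = 1.0
    return G
```

With such a `Z`, an n × n similarity could be represented implicitly in low-rank form (for example as `Z @ np.diag(1 / Z.sum(axis=0)) @ Z.T` in the standard anchor-graph construction, assumed here) without ever materializing it. For an indicator matrix `G`, the product `G.T @ G` is diagonal with the cluster sizes on the diagonal, which is the kind of orthogonality structure the abstract alludes to.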

Original language: English
Article number: 127173
Journal: Neurocomputing
Volume: 573
State: Published - 7 Mar 2024

Keywords

  • Anchor graph
  • Concept factorization
  • Document clustering
  • Orthogonality
