TY - JOUR
T1 - Embedded fuzzy C-means joint row-sparse principal component analysis
AU - Wang, Jikui
AU - Li, Xiran
AU - Ma, Yuqi
AU - Liu, Feifei
AU - Nie, Feiping
N1 - Publisher Copyright: © 2026 Elsevier Inc.
PY - 2026/9/5
Y1 - 2026/9/5
N2 - High-dimensional data clustering faces the well-known “curse of dimensionality”. Traditional methods usually adopt a two-stage strategy: they first reduce the dimensionality of the data and then apply a clustering algorithm to the reduced data. This strategy has two main drawbacks. First, the goals of dimensionality reduction and clustering are not necessarily consistent, so the reduced data may not be suitable for clustering. Second, using feature extraction or feature selection alone for dimensionality reduction makes it difficult to find latent data structures in low-dimensional spaces that are better suited to clustering. To tackle these issues, we propose embedded fuzzy C-means joint row-sparse principal component analysis (RS-EFCM), which performs feature selection, feature extraction, and clustering simultaneously. To handle the non-smoothness and non-convexity of the ℓ2,0-norm, we employ a coordinate descent approach to seek an optimal solution. The RS-EFCM algorithm has linear time complexity with respect to the number of samples. We carried out comprehensive experiments on eight datasets to demonstrate the efficacy and convergence properties of the RS-EFCM algorithm. The code is available at https://github.com/LZUFE-Machine-Learning/RS-EFCM.
KW - Feature extraction
KW - Feature selection
KW - Fuzzy c-means clustering
KW - Row-sparse principal component analysis
UR - https://www.scopus.com/pages/publications/105036406453
U2 - 10.1016/j.ins.2026.123552
DO - 10.1016/j.ins.2026.123552
M3 - Article
AN - SCOPUS:105036406453
SN - 0020-0255
VL - 749
JO - Information Sciences
JF - Information Sciences
M1 - 123552
ER -