TY - JOUR
T1 - Fast Sparse Discriminative K-Means for Unsupervised Feature Selection
AU - Nie, Feiping
AU - Ma, Zhenyu
AU - Wang, Jingyu
AU - Li, Xuelong
N1 - Publisher Copyright:
© 2012 IEEE.
PY - 2024
Y1 - 2024
N2 - Embedded feature selection approaches guide the learning of a subsequent projection matrix (selection matrix) through the acquisition of a pseudolabel matrix to conduct feature selection tasks. Yet the continuous pseudolabel matrix learned from the relaxed problem based on spectral analysis deviates from reality to some extent. To cope with this issue, we design an efficient feature selection framework inspired by classical least-squares regression (LSR) and discriminative K-means (DisK-means), called fast sparse discriminative K-means (FSDK) feature selection. First, a weighted pseudolabel matrix with a discrete trait is introduced to avoid the trivial solution of unsupervised LSR. Under this condition, no constraint needs to be imposed on the pseudolabel matrix or the selection matrix, which significantly simplifies the combinatorial optimization problem. Second, the ℓ2,p-norm regularizer is introduced to enforce row sparsity of the selection matrix with flexible p. Consequently, the proposed FSDK model can be treated as a novel feature selection framework that integrates the DisK-means algorithm with the ℓ2,p-norm regularizer to optimize the sparse regression problem. Moreover, the model's complexity is linear in the number of samples, making it fast on large-scale data. Comprehensive experiments on various datasets demonstrate the effectiveness and efficiency of FSDK.
AB - Embedded feature selection approaches guide the learning of a subsequent projection matrix (selection matrix) through the acquisition of a pseudolabel matrix to conduct feature selection tasks. Yet the continuous pseudolabel matrix learned from the relaxed problem based on spectral analysis deviates from reality to some extent. To cope with this issue, we design an efficient feature selection framework inspired by classical least-squares regression (LSR) and discriminative K-means (DisK-means), called fast sparse discriminative K-means (FSDK) feature selection. First, a weighted pseudolabel matrix with a discrete trait is introduced to avoid the trivial solution of unsupervised LSR. Under this condition, no constraint needs to be imposed on the pseudolabel matrix or the selection matrix, which significantly simplifies the combinatorial optimization problem. Second, the ℓ2,p-norm regularizer is introduced to enforce row sparsity of the selection matrix with flexible p. Consequently, the proposed FSDK model can be treated as a novel feature selection framework that integrates the DisK-means algorithm with the ℓ2,p-norm regularizer to optimize the sparse regression problem. Moreover, the model's complexity is linear in the number of samples, making it fast on large-scale data. Comprehensive experiments on various datasets demonstrate the effectiveness and efficiency of FSDK.
KW - embedded feature selection
KW - least-squares regression (LSR)
KW - sparse discriminative K-means
KW - trivial solution
KW - weighted pseudolabel matrix
KW - ℓ2,p-norm regularizer
UR - http://www.scopus.com/inward/record.url?scp=85147316303&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2023.3238103
DO - 10.1109/TNNLS.2023.3238103
M3 - Article
C2 - 37022041
AN - SCOPUS:85147316303
SN - 2162-237X
VL - 35
SP - 9943
EP - 9957
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 7
ER -