TY - JOUR
T1 - Joint Anchor Graph Embedding and Discrete Feature Scoring for Unsupervised Feature Selection
AU - Wang, Zheng
AU - Wu, Dongming
AU - Wang, Rong
AU - Nie, Feiping
AU - Wang, Fei
N1 - Publisher Copyright: © 2012 IEEE.
PY - 2024/6/1
Y1 - 2024/6/1
N2 - The success of existing unsupervised feature selection (UFS) methods relies heavily on the assumption that the intrinsic relationships among the original high-dimensional (HD) data samples exist in a discriminative low-dimensional (LD) subspace. However, previous UFS methods commonly construct pairwise graphs to preserve the local structure and employ ℓ2,1-norm regularization to compute feature scores, which is computationally expensive and prone to getting stuck in local optima, so these approaches cannot be applied to large-scale datasets in practice. To overcome this challenge, we propose a novel UFS method in which an anchor graph embedding paradigm is designed to extract the local neighborhood relationships among data samples while reducing the computational complexity of graph construction to linear in the number of samples. Moreover, to improve the optimality of the selected features as well as the performance of downstream tasks, we propose a discrete feature scoring mechanism that imposes orthogonal ℓ2,0-norm constraints on the learned projections, in order to make feature scores more distinguishable and to reduce the probability of falling into local optima. Because the resulting nonconvex and nonsmooth problem is NP-hard and challenging to solve, we present an efficient optimization algorithm to address it and obtain a closed-form solution for the transformation matrix. Extensive experiments on clustering and image segmentation tasks demonstrate the effectiveness and efficiency of the proposed UFS method in comparison with several state-of-the-art approaches.
AB - The success of existing unsupervised feature selection (UFS) methods relies heavily on the assumption that the intrinsic relationships among the original high-dimensional (HD) data samples exist in a discriminative low-dimensional (LD) subspace. However, previous UFS methods commonly construct pairwise graphs to preserve the local structure and employ ℓ2,1-norm regularization to compute feature scores, which is computationally expensive and prone to getting stuck in local optima, so these approaches cannot be applied to large-scale datasets in practice. To overcome this challenge, we propose a novel UFS method in which an anchor graph embedding paradigm is designed to extract the local neighborhood relationships among data samples while reducing the computational complexity of graph construction to linear in the number of samples. Moreover, to improve the optimality of the selected features as well as the performance of downstream tasks, we propose a discrete feature scoring mechanism that imposes orthogonal ℓ2,0-norm constraints on the learned projections, in order to make feature scores more distinguishable and to reduce the probability of falling into local optima. Because the resulting nonconvex and nonsmooth problem is NP-hard and challenging to solve, we present an efficient optimization algorithm to address it and obtain a closed-form solution for the transformation matrix. Extensive experiments on clustering and image segmentation tasks demonstrate the effectiveness and efficiency of the proposed UFS method in comparison with several state-of-the-art approaches.
KW - anchor graph embedding
KW - image segmentation
KW - nonconvex optimization
KW - pattern clustering
KW - unsupervised feature selection
KW - ℓ2,0-norm constraint
UR - http://www.scopus.com/inward/record.url?scp=85144030880&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2022.3222466
DO - 10.1109/TNNLS.2022.3222466
M3 - Article
C2 - 36417731
AN - SCOPUS:85144030880
SN - 2162-237X
VL - 35
SP - 7974
EP - 7987
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 6
ER -