TY - JOUR
T1 - Principal Component Analysis with Fuzzy Elastic Net for Feature Selection
AU - Gao, Yunlong
AU - Wu, Qinting
AU - Xu, Zhenghong
AU - Cao, Chao
AU - Pan, Jinyan
AU - Shao, Guifang
AU - Nie, Feiping
AU - Zhu, Qingyuan
N1 - Publisher Copyright: © 1993-2012 IEEE.
PY - 2024
Y1 - 2024
AB - Feature selection serves as a fundamental technique in machine learning and data analysis, playing a crucial role in extracting valuable features from large-scale and high-dimensional datasets that may contain irrelevant features. To enhance the performance of feature selection, regularizers like the ℓ1-norm or ℓ2,1-norm are commonly utilized to encourage sparsity. Nonetheless, these traditional regularization techniques encounter certain challenges. When correlations exist among features, sparsity-driven regularization can unfairly diminish the weights of correlated features to zero, thus ignoring the feature correlations and lacking group sparsity properties. While a straightforward combination of the ℓ1-norm and ℓ2,1-norm can uncover feature correlations, it lacks adaptability and cannot effectively balance sparsity and correlation. To address these challenges, we introduce a novel matrix-based regularization term, called a fuzzy elastic net, in the unsupervised feature selection model. Our model is founded on principal component analysis, a well-established dimensionality reduction technique adept at finding subspaces that retain most information from raw data. The model is enhanced by a fuzzy elastic net, which promotes group or sparsity properties through adaptive parameter tuning. The new regularization term introduces a flexible fuzzy weighted scheme combining the ℓ2,2-norm and ℓ2,p-norm (0 < p ≤ 1). This approach allows adaptive adjustment based on data characteristics, offering a tunable balance between selecting discriminative features and identifying correlated ones. Consequently, this regularization term equips the model to handle diverse data analysis tasks flexibly, thereby enhancing adaptability and generalization performance. Furthermore, we propose an efficient optimization strategy to solve this model. Extensive experiments conducted on UCI datasets and real-world datasets demonstrate the effectiveness and efficiency of our proposed method.
KW - Feature selection
KW - fuzzy elastic net
KW - group sparsity
KW - principal component analysis (PCA)
KW - sparsity
UR - http://www.scopus.com/inward/record.url?scp=85212242385&partnerID=8YFLogxK
U2 - 10.1109/TFUZZ.2024.3466926
DO - 10.1109/TFUZZ.2024.3466926
M3 - Article
AN - SCOPUS:85212242385
SN - 1063-6706
VL - 32
SP - 6878
EP - 6890
JO - IEEE Transactions on Fuzzy Systems
JF - IEEE Transactions on Fuzzy Systems
IS - 12
ER -