TY - JOUR
T1 - Robust Principal Component Analysis Based on Fuzzy Local Information Reservation
AU - Gao, Yunlong
AU - Wang, Xinjing
AU - Xie, Jiaxin
AU - Pan, Jinyan
AU - Yan, Peng
AU - Nie, Feiping
N1 - Publisher Copyright:
© 1979-2012 IEEE.
PY - 2024
Y1 - 2024
N2 - Principal Component Analysis (PCA) aims to acquire the principal component space containing the essential structure of data, instead of being used for mining and extracting the essential structure of data. In other words, the principal component space contains not only information related to the essential structure of data but also some unrelated information. This frequently occurs when the intrinsic dimensionality of data is unknown or when it has complex distribution characteristics such as multi-modalities, manifolds, etc. Therefore, it is unreasonable to identify noise and useful information based solely on reconstruction error. For this reason, PCA is unsuitable as a preprocessing technique for most applications, especially in noisy environment. To solve this problem, this paper proposes robust PCA based on fuzzy local information reservation (FLIPCA). By analyzing the impact of reconstruction error on sample discriminability, FLIPCA provides a theoretical basis for noise identification and processing. This not only greatly improves its robustness but also extends its applicability and effectiveness as a data preprocessing technique. Meanwhile, FLIPCA maintains consistent mathematical descriptions with traditional PCA while having few adjustable hyperparameters and low algorithmic complexity. Finally, we conducted comprehensive experiments on synthetic and real-world datasets, which substantiated the superiority of our proposed algorithm.
AB - Principal Component Analysis (PCA) aims to acquire the principal component space containing the essential structure of data, instead of being used for mining and extracting the essential structure of data. In other words, the principal component space contains not only information related to the essential structure of data but also some unrelated information. This frequently occurs when the intrinsic dimensionality of data is unknown or when it has complex distribution characteristics such as multi-modalities, manifolds, etc. Therefore, it is unreasonable to identify noise and useful information based solely on reconstruction error. For this reason, PCA is unsuitable as a preprocessing technique for most applications, especially in noisy environment. To solve this problem, this paper proposes robust PCA based on fuzzy local information reservation (FLIPCA). By analyzing the impact of reconstruction error on sample discriminability, FLIPCA provides a theoretical basis for noise identification and processing. This not only greatly improves its robustness but also extends its applicability and effectiveness as a data preprocessing technique. Meanwhile, FLIPCA maintains consistent mathematical descriptions with traditional PCA while having few adjustable hyperparameters and low algorithmic complexity. Finally, we conducted comprehensive experiments on synthetic and real-world datasets, which substantiated the superiority of our proposed algorithm.
KW - data preprocessing
KW - discriminative information
KW - noise identification
KW - Principal component analysis
KW - robustness
UR - https://www.scopus.com/pages/publications/85197086656
U2 - 10.1109/TPAMI.2024.3418983
DO - 10.1109/TPAMI.2024.3418983
M3 - 文章
C2 - 38917285
AN - SCOPUS:85197086656
SN - 0162-8828
VL - 46
SP - 9321
EP - 9337
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 12
ER -