TY - JOUR
T1 - Data Subdivision Based Dual-Weighted Robust Principal Component Analysis
AU - Wang, Sisi
AU - Nie, Feiping
AU - Wang, Zheng
AU - Wang, Rong
AU - Li, Xuelong
N1 - Publisher Copyright:
© 1992-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Principal Component Analysis (PCA) is one of the most important unsupervised dimensionality reduction algorithms, but its reliance on the squared ℓ2-norm makes it very sensitive to outliers. Improved versions based on the ℓ1-norm alleviate this problem, yet they have other shortcomings, such as optimization difficulties or a lack of rotational invariance. Moreover, existing methods only vaguely divide data into normal samples and outliers to improve robustness, ignoring the fact that normal samples can be further divided into positive samples and hard samples, which should contribute differently to the model because positive samples are more conducive to learning the projection matrix. In this paper, we propose a novel Data Subdivision Based Dual-Weighted Robust Principal Component Analysis, namely DRPCA, which first designs a mark vector to distinguish normal samples from outliers and directly removes outliers according to the mark weights. Furthermore, we divide normal samples into positive samples and hard samples via self-constrained weights and place them in relative positions, so that the weight of a positive sample is larger than that of a hard sample, which makes the projection matrix more accurate. Additionally, the optimal mean is employed to obtain a more accurate data center. To solve the resulting optimization problem, we carefully design an effective iterative algorithm and analyze its convergence. Experiments on real-world and large-scale RGB datasets demonstrate the superiority of our method in dimensionality reduction and anomaly detection.
AB - Principal Component Analysis (PCA) is one of the most important unsupervised dimensionality reduction algorithms, but its reliance on the squared ℓ2-norm makes it very sensitive to outliers. Improved versions based on the ℓ1-norm alleviate this problem, yet they have other shortcomings, such as optimization difficulties or a lack of rotational invariance. Moreover, existing methods only vaguely divide data into normal samples and outliers to improve robustness, ignoring the fact that normal samples can be further divided into positive samples and hard samples, which should contribute differently to the model because positive samples are more conducive to learning the projection matrix. In this paper, we propose a novel Data Subdivision Based Dual-Weighted Robust Principal Component Analysis, namely DRPCA, which first designs a mark vector to distinguish normal samples from outliers and directly removes outliers according to the mark weights. Furthermore, we divide normal samples into positive samples and hard samples via self-constrained weights and place them in relative positions, so that the weight of a positive sample is larger than that of a hard sample, which makes the projection matrix more accurate. Additionally, the optimal mean is employed to obtain a more accurate data center. To solve the resulting optimization problem, we carefully design an effective iterative algorithm and analyze its convergence. Experiments on real-world and large-scale RGB datasets demonstrate the superiority of our method in dimensionality reduction and anomaly detection.
KW - Dual-weighted
KW - anomaly detection
KW - data subdivision
KW - dimensionality reduction
UR - http://www.scopus.com/inward/record.url?scp=85217950594&partnerID=8YFLogxK
U2 - 10.1109/TIP.2025.3536197
DO - 10.1109/TIP.2025.3536197
M3 - Article
AN - SCOPUS:85217950594
SN - 1057-7149
VL - 34
SP - 1271
EP - 1284
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
ER -