Abstract
High dimensionality is one of the key characteristics of big data. Feature selection, a framework for identifying a small subset of illustrative and discriminative features, has proved to be a basic solution for dealing with high-dimensional data. In the previous literature, ℓ2,p-norm regularization has been studied by many researchers as an effective approach to selecting features across data sets with sparsity. However, the ℓ2,p-norm loss function is robust only to noise and does not account for the influence of outliers. In this paper, we propose a new robust and efficient feature selection method based on Simultaneous Capped ℓ2-norm loss and ℓ2,p-norm regularizer Minimization (SCM). The capped ℓ2-norm based loss function can effectively eliminate the influence of noise and outliers in regression, and the ℓ2,p-norm regularization is used to select features across data sets with joint sparsity. An efficient approach with proven convergence is then introduced. Extensive experimental studies on synthetic and real-world datasets demonstrate the effectiveness of our method in comparison with other popular feature selection methods.
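
To make the two terms named in the abstract concrete, the sketch below evaluates an objective that combines a capped ℓ2-norm loss with an ℓ2,p-norm regularizer. The abstract does not state the exact formulation, so the form used here is an assumption: each sample's residual norm is capped at a threshold `eps`, and the regularizer sums the p-th powers of the ℓ2 norms of the rows of the weight matrix `W`. The function names and the parameters `eps`, `gamma`, and `p` are hypothetical and chosen only for illustration; this is not the authors' optimization algorithm, only an evaluation of one plausible objective.

```python
import numpy as np


def capped_l2_loss(X, Y, W, eps):
    """Sum over samples of min(||x_i W - y_i||_2, eps) (assumed form).

    X: (n_samples, n_features), Y: (n_samples, n_targets),
    W: (n_features, n_targets), eps: cap on each sample's residual norm.
    """
    residual_norms = np.linalg.norm(X @ W - Y, axis=1)
    # Capping bounds the contribution of any single sample, so an
    # outlier with a huge residual cannot dominate the loss.
    return np.minimum(residual_norms, eps).sum()


def l2p_regularizer(W, p):
    """Sum over rows of ||w_j||_2 ** p (assumed form of the l2,p term).

    Each row of W corresponds to one feature; rows driven to zero
    correspond to features discarded for all targets jointly.
    """
    row_norms = np.linalg.norm(W, axis=1)
    return np.sum(row_norms ** p)


def scm_objective(X, Y, W, eps, gamma, p):
    """Capped loss plus gamma times the l2,p regularizer (illustrative)."""
    return capped_l2_loss(X, Y, W, eps) + gamma * l2p_regularizer(W, p)
```

Under these assumptions, features could be ranked by the ℓ2 norm of the corresponding rows of a fitted `W`, with the largest norms indicating the selected features.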
Original language | English |
---|---|
Pages (from-to) | 228-240 |
Number of pages | 13 |
Journal | Neurocomputing |
Volume | 283 |
State | Published - 29 Mar 2018 |
Keywords
- Capped ℓ2-norm loss
- Feature selection
- ℓ2,p-norm regularization