Abstract
High dimensionality is one of the key characteristics of big data. Feature selection, a framework for identifying a small subset of illustrative and discriminative features, has proven to be a basic solution for dealing with high-dimensional data. In the literature, ℓ2,p-norm regularization has been studied by many researchers as an effective approach to selecting features across data sets with sparsity. However, an ℓ2,p-norm loss function is robust only to noise and does not account for the influence of outliers. In this paper, we propose a new robust and efficient feature selection method that emphasizes Simultaneous Capped ℓ2-norm loss and ℓ2,p-norm regularizer Minimization (SCM). The capped ℓ2-norm based loss function can effectively eliminate the influence of noise and outliers in regression, and the ℓ2,p-norm regularization is used to select features across data sets with joint sparsity. An efficient approach is then introduced, with proven convergence. Extensive experimental studies on synthetic and real-world datasets demonstrate the effectiveness of our method in comparison with other popular feature selection methods.
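To make the two terms named in the abstract concrete, the following is a minimal NumPy sketch of how a capped ℓ2-norm loss and an ℓ2,p-norm regularizer are commonly written. The abstract does not state the exact objective, so the form below, the function names, and the parameters `eps` (cap threshold), `lam` (regularization weight), and `p` are assumptions for illustration, not the paper's implementation.

```python
import numpy as np


def capped_l2_loss(X, Y, W, eps):
    """Sum over samples of min(||x_i W - y_i||_2, eps).

    X is (n_samples, n_features), W is (n_features, n_targets),
    Y is (n_samples, n_targets). Capping at eps bounds the
    contribution of any single sample, which limits outliers.
    """
    residual_norms = np.linalg.norm(X @ W - Y, axis=1)
    return float(np.minimum(residual_norms, eps).sum())


def l2p_regularizer(W, p):
    """Sum over rows of W of ||w_j||_2 ** p.

    Each row of W corresponds to one feature, so penalizing row
    norms pushes whole rows toward zero (joint sparsity).
    """
    row_norms = np.linalg.norm(W, axis=1)
    return float((row_norms ** p).sum())


def scm_style_objective(X, Y, W, eps, lam, p):
    """Assumed combined objective: capped loss + lam * regularizer."""
    return capped_l2_loss(X, Y, W, eps) + lam * l2p_regularizer(W, p)


# Tiny example: the first sample has a large residual and is capped.
X = np.eye(2)
Y = np.zeros((2, 2))
W = np.array([[3.0, 4.0], [0.0, 0.0]])
value = scm_style_objective(X, Y, W, eps=2.0, lam=0.1, p=1.0)
```

In this toy setup the first sample's residual norm exceeds the cap, so it contributes only `eps` to the loss, while the all-zero second row of `W` contributes nothing to the regularizer, which is how a feature ends up deselected.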
Original language | English |
---|---|
Pages (from-to) | 228-240 |
Number of pages | 13 |
Journal | Neurocomputing |
Volume | 283 |
DOI | |
Publication status | Published - 29 Mar 2018 |