TY - JOUR
T1 - Unified dual-label semi-supervised learning with top-k feature selection
AU - Zhang, Han
AU - Gong, Maoguo
AU - Nie, Feiping
AU - Li, Xuelong
N1 - Publisher Copyright:
© 2022
PY - 2022/8/28
Y1 - 2022/8/28
N2 - Semi-supervised feature selection alleviates the annotation burden of supervised feature learning by exploiting data with only a handful of supervised labels. The mainstream technique employs a linear regression framework that jointly learns from labeled and unlabeled samples. However, existing approaches suffer from deficiencies in two respects: 1) model performance degrades severely once the predicted labels are unreliable; 2) the balance between the objectives for the two types of data is not well considered. In this article, we propose unified dual-label semi-supervised learning for top-k feature selection. The technique defines a soft label matrix that indicates the probability of each sample belonging to each class. From these probabilities, the model can recognize unclassifiable samples that lie near the class boundaries. Meanwhile, the label matrix is equipped with an exponent parameter γ, which endows the soft labels with a dual effect that tactfully discriminates between labeled and unlabeled data. For the purpose of feature selection, we impose an ℓ2,0-norm constraint on the projection matrix, such that exactly the top-k features are picked out. An iterative algorithm is designed to solve the given problem, by which large-scale data are handled efficiently. We conduct experiments that validate the superiority of the proposed method against state-of-the-art competitors.
AB - Semi-supervised feature selection alleviates the annotation burden of supervised feature learning by exploiting data with only a handful of supervised labels. The mainstream technique employs a linear regression framework that jointly learns from labeled and unlabeled samples. However, existing approaches suffer from deficiencies in two respects: 1) model performance degrades severely once the predicted labels are unreliable; 2) the balance between the objectives for the two types of data is not well considered. In this article, we propose unified dual-label semi-supervised learning for top-k feature selection. The technique defines a soft label matrix that indicates the probability of each sample belonging to each class. From these probabilities, the model can recognize unclassifiable samples that lie near the class boundaries. Meanwhile, the label matrix is equipped with an exponent parameter γ, which endows the soft labels with a dual effect that tactfully discriminates between labeled and unlabeled data. For the purpose of feature selection, we impose an ℓ2,0-norm constraint on the projection matrix, such that exactly the top-k features are picked out. An iterative algorithm is designed to solve the given problem, by which large-scale data are handled efficiently. We conduct experiments that validate the superiority of the proposed method against state-of-the-art competitors.
KW - Dual-label matrix learning
KW - Semi-supervised learning
KW - Top-k feature selection
KW - Unclassifiable sample recognition
UR - http://www.scopus.com/inward/record.url?scp=85134511733&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2022.05.090
DO - 10.1016/j.neucom.2022.05.090
M3 - Article
AN - SCOPUS:85134511733
SN - 0925-2312
VL - 501
SP - 875
EP - 888
JO - Neurocomputing
JF - Neurocomputing
ER -