TY - JOUR
T1 - Unified dual-label semi-supervised learning with top-k feature selection
AU - Zhang, Han
AU - Gong, Maoguo
AU - Nie, Feiping
AU - Li, Xuelong
N1 - Publisher Copyright:
© 2022
PY - 2022/8/28
Y1 - 2022/8/28
N2 - Semi-supervised feature selection alleviates the annotation burden of supervised feature learning by exploiting data with only a handful of supervised labels. The mainstream technique employs a linear regression framework that jointly learns from labeled and unlabeled samples. However, existing approaches suffer from deficiencies in two respects: 1) model performance degrades severely once the predicted labels are unreliable; 2) the balance between the objectives for the two types of data is not well considered. In this article, we propose unified dual-label semi-supervised learning for top-k feature selection. The technique defines a soft label matrix that indicates the probability of each sample belonging to each class. From these probabilities, the model can recognize unclassifiable samples that lie near the class boundaries. Meanwhile, the label matrix is equipped with an exponent parameter γ, which endows the soft labels with a dual effect that tactfully discriminates between labeled and unlabeled data. For the purpose of feature selection, we impose an ℓ2,0-norm constraint on the projection matrix, such that exactly the top-k features are picked out. An iterative algorithm is designed to solve the given problem, by which large-scale data are handled efficiently. We conduct experiments that validate the superiority of the proposed method against state-of-the-art competitors.
AB - Semi-supervised feature selection alleviates the annotation burden of supervised feature learning by exploiting data with only a handful of supervised labels. The mainstream technique employs a linear regression framework that jointly learns from labeled and unlabeled samples. However, existing approaches suffer from deficiencies in two respects: 1) model performance degrades severely once the predicted labels are unreliable; 2) the balance between the objectives for the two types of data is not well considered. In this article, we propose unified dual-label semi-supervised learning for top-k feature selection. The technique defines a soft label matrix that indicates the probability of each sample belonging to each class. From these probabilities, the model can recognize unclassifiable samples that lie near the class boundaries. Meanwhile, the label matrix is equipped with an exponent parameter γ, which endows the soft labels with a dual effect that tactfully discriminates between labeled and unlabeled data. For the purpose of feature selection, we impose an ℓ2,0-norm constraint on the projection matrix, such that exactly the top-k features are picked out. An iterative algorithm is designed to solve the given problem, by which large-scale data are handled efficiently. We conduct experiments that validate the superiority of the proposed method against state-of-the-art competitors.
KW - Dual-label matrix learning
KW - Semi-supervised learning
KW - Top-k feature selection
KW - Unclassifiable sample recognition
UR - http://www.scopus.com/inward/record.url?scp=85134511733&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2022.05.090
DO - 10.1016/j.neucom.2022.05.090
M3 - Article
AN - SCOPUS:85134511733
SN - 0925-2312
VL - 501
SP - 875
EP - 888
JO - Neurocomputing
JF - Neurocomputing
ER -