A robust self-training algorithm based on relative node graph

Jikui Wang; Huiyu Duan; Cuihong Zhang; Feiping Nie

doi:10.1007/s10489-024-06062-0

School of Artificial Intelligence, OPtics and Electronics

Lanzhou University of Finance and Economics

Research output: Contribution to journal › Article › peer-review

Abstract

Self-training is a well-known semi-supervised learning framework, and selecting high-confidence samples is its key step. If high-confidence samples with incorrect labels are used to train the classifier, the error compounds over successive iterations. To improve the quality of high-confidence samples, a novel data editing technique termed Relative Node Graph Editing (RNGE) is proposed. Specifically, mass estimation is used to calculate the density and peak of each sample, from which a prototype tree is built to reveal the underlying spatial structure of the data. Then, the Relative Node Graph (RNG) is defined for each sample. Finally, mislabeled samples in the candidate high-confidence sample set are identified by a hypothesis test based on the RNG. Combining these steps, we propose a Robust Self-training Algorithm based on Relative Node Graph (STRNG), which uses RNGE to identify and edit mislabeled samples. Experimental results show that the proposed algorithm improves the performance of self-training.
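
The loop described in the abstract (train, select high-confidence candidates, edit out likely mislabeled ones, retrain) can be illustrated with a minimal Python sketch. This is not the authors' STRNG implementation: the nearest-centroid classifier, the softmax-style confidence score, the `threshold` and `k` parameters, and every function name are assumptions made for illustration, and the editing step is a simple k-nearest-neighbor agreement check standing in for RNGE, whose prototype tree and hypothesis test are not specified in this record.

```python
import math


def fit_centroids(X, y):
    """Return {label: centroid} computed from the labeled points."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_with_confidence(centroids, x):
    """Predict the nearest centroid's label with a softmax-style confidence."""
    dists = {c: math.dist(x, mu) for c, mu in centroids.items()}
    label = min(dists, key=dists.get)
    # Shift by the smallest distance so the exponentials cannot all underflow.
    total = sum(math.exp(-(d - dists[label])) for d in dists.values())
    return label, 1.0 / total


def neighbors_agree(X_lab, y_lab, x, label, k):
    """Placeholder editing rule (NOT RNGE): accept a pseudo-label only if
    the majority of the k nearest labeled points carry the same label."""
    nearest = sorted(zip(X_lab, y_lab), key=lambda p: math.dist(p[0], x))[:k]
    votes = sum(1 for _, lab in nearest if lab == label)
    return 2 * votes > len(nearest)


def self_train(X_lab, y_lab, X_unlab, threshold=0.8, k=3, max_iter=10):
    """Generic self-training with an editing step on the candidates."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_iter):
        centroids = fit_centroids(X_lab, y_lab)
        accepted, remaining = [], []
        for x in pool:
            label, conf = predict_with_confidence(centroids, x)
            # Keep a candidate only if it is confident AND survives editing.
            if conf >= threshold and neighbors_agree(X_lab, y_lab, x, label, k):
                accepted.append((x, label))
            else:
                remaining.append(x)
        if not accepted:
            break  # nothing new to learn from, so stop early
        for x, label in accepted:
            X_lab.append(x)
            y_lab.append(label)
        pool = remaining
    return fit_centroids(X_lab, y_lab), X_lab, y_lab
```

Calling `self_train(X_lab, y_lab, X_unlab)` returns the final centroids together with the enlarged labeled set. Candidates rejected by the editing rule stay in the unlabeled pool and are reconsidered in later iterations, which is one possible design choice and not necessarily the one made in the paper.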

Original language	English
Article number	1
Journal	Applied Intelligence
Volume	55
Issue number	1
DOIs	https://doi.org/10.1007/s10489-024-06062-0
State	Published - Jan 2025

Keywords

Data editing
High-confidence samples
Self-training
Semi-supervised learning

Access to Document

10.1007/s10489-024-06062-0

Cite this

@article{64c7678c145c4436ba501eccc1fe3cf9,

title = "A robust self-training algorithm based on relative node graph",

abstract = "Self-training is a well-known semi-supervised learning framework, and selecting high-confidence samples is its key step. If high-confidence samples with incorrect labels are used to train the classifier, the error compounds over successive iterations. To improve the quality of high-confidence samples, a novel data editing technique termed Relative Node Graph Editing (RNGE) is proposed. Specifically, mass estimation is used to calculate the density and peak of each sample, from which a prototype tree is built to reveal the underlying spatial structure of the data. Then, the Relative Node Graph (RNG) is defined for each sample. Finally, mislabeled samples in the candidate high-confidence sample set are identified by a hypothesis test based on the RNG. Combining these steps, we propose a Robust Self-training Algorithm based on Relative Node Graph (STRNG), which uses RNGE to identify and edit mislabeled samples. Experimental results show that the proposed algorithm improves the performance of self-training.",

keywords = "Data editing, High-confidence samples, Self-training, Semi-supervised learning",

author = "Jikui Wang and Huiyu Duan and Cuihong Zhang and Feiping Nie",

note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.",

year = "2025",

month = jan,

doi = "10.1007/s10489-024-06062-0",

language = "English",

volume = "55",

journal = "Applied Intelligence",

issn = "0924-669X",

publisher = "Springer Netherlands",

number = "1",

}

TY - JOUR

T1 - A robust self-training algorithm based on relative node graph

AU - Wang, Jikui

AU - Duan, Huiyu

AU - Zhang, Cuihong

AU - Nie, Feiping

N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

PY - 2025/1

Y1 - 2025/1

N2 - Self-training is a well-known semi-supervised learning framework, and selecting high-confidence samples is its key step. If high-confidence samples with incorrect labels are used to train the classifier, the error compounds over successive iterations. To improve the quality of high-confidence samples, a novel data editing technique termed Relative Node Graph Editing (RNGE) is proposed. Specifically, mass estimation is used to calculate the density and peak of each sample, from which a prototype tree is built to reveal the underlying spatial structure of the data. Then, the Relative Node Graph (RNG) is defined for each sample. Finally, mislabeled samples in the candidate high-confidence sample set are identified by a hypothesis test based on the RNG. Combining these steps, we propose a Robust Self-training Algorithm based on Relative Node Graph (STRNG), which uses RNGE to identify and edit mislabeled samples. Experimental results show that the proposed algorithm improves the performance of self-training.

AB - Self-training is a well-known semi-supervised learning framework, and selecting high-confidence samples is its key step. If high-confidence samples with incorrect labels are used to train the classifier, the error compounds over successive iterations. To improve the quality of high-confidence samples, a novel data editing technique termed Relative Node Graph Editing (RNGE) is proposed. Specifically, mass estimation is used to calculate the density and peak of each sample, from which a prototype tree is built to reveal the underlying spatial structure of the data. Then, the Relative Node Graph (RNG) is defined for each sample. Finally, mislabeled samples in the candidate high-confidence sample set are identified by a hypothesis test based on the RNG. Combining these steps, we propose a Robust Self-training Algorithm based on Relative Node Graph (STRNG), which uses RNGE to identify and edit mislabeled samples. Experimental results show that the proposed algorithm improves the performance of self-training.

KW - Data editing

KW - High-confidence samples

KW - Self-training

KW - Semi-supervised learning

UR - http://www.scopus.com/inward/record.url?scp=85209386274&partnerID=8YFLogxK

U2 - 10.1007/s10489-024-06062-0

DO - 10.1007/s10489-024-06062-0

M3 - Article

AN - SCOPUS:85209386274

SN - 0924-669X

VL - 55

JO - Applied Intelligence

JF - Applied Intelligence

IS - 1

M1 - 1

ER -
