TY - JOUR
T1 - A robust self-training algorithm based on relative node graph
AU - Wang, Jikui
AU - Duan, Huiyu
AU - Zhang, Cuihong
AU - Nie, Feiping
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
PY - 2025/1
Y1 - 2025/1
AB - Self-training is a well-known framework for semi-supervised learning. Selecting high-confidence samples is the key step in self-training. If high-confidence samples with incorrect labels are used to train the classifier, the error compounds over successive iterations. To improve the quality of high-confidence samples, a novel data editing technique termed Relative Node Graph Editing (RNGE) is proposed. Specifically, mass estimation is used to calculate the density and peak of each sample, from which a prototype tree is built to reveal the underlying spatial structure of the data. Then, the Relative Node Graph (RNG) is defined for each sample. Finally, mislabeled samples in the candidate high-confidence sample set are identified by a hypothesis test based on the RNG. Combining these steps, we propose a Robust Self-training Algorithm based on Relative Node Graph (STRNG), which uses RNGE to identify and edit mislabeled samples. Experimental results show that the proposed algorithm improves the performance of self-training.
KW - Data editing
KW - High-confidence samples
KW - Self-training
KW - Semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85209386274&partnerID=8YFLogxK
U2 - 10.1007/s10489-024-06062-0
DO - 10.1007/s10489-024-06062-0
M3 - Article
AN - SCOPUS:85209386274
SN - 0924-669X
VL - 55
JO - Applied Intelligence
JF - Applied Intelligence
IS - 1
M1 - 1
ER -