TY - JOUR
T1 - A Convex Formulation for Fast Semi-Supervised Learning
AU - Fan, Xinyi
AU - Yu, Weizhong
AU - Nie, Feiping
AU - Miao, Zongcheng
AU - Li, Xuelong
N1 - Publisher Copyright:
© 1989-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - As a compromise between supervised and unsupervised learning, semi-supervised learning (SSL) harnesses both labeled and unlabeled data to enhance learning performance. Graph-based semi-supervised learning (GSSL) has emerged as a prominent approach owing to its versatility in representing sample interdependencies via graph structures. However, traditional GSSL methods incur high time costs when computing matrix inverses, making them inefficient for large datasets. To address this, some researchers have introduced anchors as a bridge to accelerate the process. Nevertheless, most anchor-based models suffer from one or more of the following issues: (1) the anchor graph-based construction of the adjacency matrix has limitations; (2) the objective functions are typically non-convex, leading to local optima and requiring multiple runs to achieve good performance. To tackle these challenges, we develop a probability-driven approach to build the adjacency matrix, defining sample similarity as the probability of sharing the same anchor. Based on this strategy, we design a model (CFSL) with a strictly convex objective function, guaranteeing a globally optimal solution without iterative optimization. Experiments on multiple datasets indicate that our algorithm yields strong performance.
AB - As a compromise between supervised and unsupervised learning, semi-supervised learning (SSL) harnesses both labeled and unlabeled data to enhance learning performance. Graph-based semi-supervised learning (GSSL) has emerged as a prominent approach owing to its versatility in representing sample interdependencies via graph structures. However, traditional GSSL methods incur high time costs when computing matrix inverses, making them inefficient for large datasets. To address this, some researchers have introduced anchors as a bridge to accelerate the process. Nevertheless, most anchor-based models suffer from one or more of the following issues: (1) the anchor graph-based construction of the adjacency matrix has limitations; (2) the objective functions are typically non-convex, leading to local optima and requiring multiple runs to achieve good performance. To tackle these challenges, we develop a probability-driven approach to build the adjacency matrix, defining sample similarity as the probability of sharing the same anchor. Based on this strategy, we design a model (CFSL) with a strictly convex objective function, guaranteeing a globally optimal solution without iterative optimization. Experiments on multiple datasets indicate that our algorithm yields strong performance.
KW - Semi-supervised learning
KW - anchor-based graph
KW - convex formulation
KW - large-scale data
UR - https://www.scopus.com/pages/publications/105017734867
U2 - 10.1109/TKDE.2025.3609886
DO - 10.1109/TKDE.2025.3609886
M3 - Article
AN - SCOPUS:105017734867
SN - 1041-4347
VL - 37
SP - 6738
EP - 6749
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 12
ER -