TY - JOUR
T1 - Fast Semisupervised Learning with Bipartite Graph for Large-Scale Data
AU - He, Fang
AU - Nie, Feiping
AU - Wang, Rong
AU - Li, Xuelong
AU - Jia, Weimin
N1 - Publisher Copyright: © 2012 IEEE.
PY - 2020/2
Y1 - 2020/2
N2 - Because the information captured in the real world is very scarce and labeling samples is time-consuming and expensive, semisupervised learning (SSL) has important applications in computer vision and machine learning. Among SSL approaches, graph-based SSL (GSSL) models have recently attracted much attention for their high accuracy. However, for most traditional GSSL methods, large-scale data bring high computational complexity, which requires a more powerful computing platform. To address these issues, we propose a novel approach, the bipartite GSSL normalized (BGSSL-normalized) method. This method consists of three parts. First, a bipartite graph between the original data and the anchor points is constructed; the construction is parameter-insensitive, scale-invariant, naturally sparse, and simple to operate. Second, the labels of the original data and the anchors are inferred through the graph. Third, we extend the algorithm to handle out-of-sample data at large scale by using the inferred labels of the anchors, which not only retains good classification results but also saves a large amount of time. The computational complexity of BGSSL-normalized is reduced to O(ndm + nm^2), a significant improvement over traditional GSSL methods, which need O(n^2 d + n^3), where n, d, and m are the number of samples, features, and anchors, respectively. Experimental results on several publicly available data sets demonstrate that our approaches achieve better classification accuracy at lower time cost.
AB - Because the information captured in the real world is very scarce and labeling samples is time-consuming and expensive, semisupervised learning (SSL) has important applications in computer vision and machine learning. Among SSL approaches, graph-based SSL (GSSL) models have recently attracted much attention for their high accuracy. However, for most traditional GSSL methods, large-scale data bring high computational complexity, which requires a more powerful computing platform. To address these issues, we propose a novel approach, the bipartite GSSL normalized (BGSSL-normalized) method. This method consists of three parts. First, a bipartite graph between the original data and the anchor points is constructed; the construction is parameter-insensitive, scale-invariant, naturally sparse, and simple to operate. Second, the labels of the original data and the anchors are inferred through the graph. Third, we extend the algorithm to handle out-of-sample data at large scale by using the inferred labels of the anchors, which not only retains good classification results but also saves a large amount of time. The computational complexity of BGSSL-normalized is reduced to O(ndm + nm^2), a significant improvement over traditional GSSL methods, which need O(n^2 d + n^3), where n, d, and m are the number of samples, features, and anchors, respectively. Experimental results on several publicly available data sets demonstrate that our approaches achieve better classification accuracy at lower time cost.
KW - Bipartite graph
KW - large-scale data
KW - out-of-sample
KW - semisupervised learning (SSL)
UR - http://www.scopus.com/inward/record.url?scp=85079088728&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2019.2908504
DO - 10.1109/TNNLS.2019.2908504
M3 - Article
C2 - 31107664
AN - SCOPUS:85079088728
SN - 2162-237X
VL - 31
SP - 626
EP - 638
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 2
M1 - 8718512
ER -