TY - JOUR
T1 - Unsupervised Large Graph Embedding Based on Balanced and Hierarchical K-Means
AU - Nie, Feiping
AU - Zhu, Wei
AU - Li, Xuelong
N1 - Publisher Copyright:
© 1989-2012 IEEE.
PY - 2022/4/1
Y1 - 2022/4/1
N2 - There are many successful spectral based unsupervised dimensionality reduction methods, including Laplacian Eigenmap (LE), Locality Preserving Projection (LPP), Spectral Regression (SR), etc. We find that LPP and SR are equivalent if the symmetric similarity matrix is doubly stochastic, Positive Semi-Definite (PSD) and with rank pp, where pp is the reduced dimension. Since solving SR is believed faster than solving LPP based on some related literature, the discovery promotes us to seek to construct such specific similarity matrix to speed up LPP solving procedures. We then propose an unsupervised linear method called Unsupervised Large Graph Embedding (ULGE). ULGE starts with a similar idea as LPP but adopts an efficient approach to construct anchor-based similarity matrix and then performs spectral analysis on it. Moreover, since conventional anchor generation strategies suffer kinds of problems, we propose an efficient and effective anchor generation strategy, called Balanced KK-means based Hierarchical KK-means (BHKH). The computational complexity of ULGE can reduce to O(ndm)O(ndm), which is a significant improvement compared to conventional methods need O(n2d)O(n2d) at least, where nn, dd and mm are the number of samples, dimensions, and anchors, respectively. Extensive experiments on several publicly available datasets demonstrate the efficiency and effectiveness of the proposed method.
AB - There are many successful spectral based unsupervised dimensionality reduction methods, including Laplacian Eigenmap (LE), Locality Preserving Projection (LPP), Spectral Regression (SR), etc. We find that LPP and SR are equivalent if the symmetric similarity matrix is doubly stochastic, Positive Semi-Definite (PSD) and with rank pp, where pp is the reduced dimension. Since solving SR is believed faster than solving LPP based on some related literature, the discovery promotes us to seek to construct such specific similarity matrix to speed up LPP solving procedures. We then propose an unsupervised linear method called Unsupervised Large Graph Embedding (ULGE). ULGE starts with a similar idea as LPP but adopts an efficient approach to construct anchor-based similarity matrix and then performs spectral analysis on it. Moreover, since conventional anchor generation strategies suffer kinds of problems, we propose an efficient and effective anchor generation strategy, called Balanced KK-means based Hierarchical KK-means (BHKH). The computational complexity of ULGE can reduce to O(ndm)O(ndm), which is a significant improvement compared to conventional methods need O(n2d)O(n2d) at least, where nn, dd and mm are the number of samples, dimensions, and anchors, respectively. Extensive experiments on several publicly available datasets demonstrate the efficiency and effectiveness of the proposed method.
KW - Large graph embedding
KW - anchor based graph
KW - balanced K-means based hierarchical K-means
KW - dimensionality reduction
UR - http://www.scopus.com/inward/record.url?scp=85126559082&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2020.3000226
DO - 10.1109/TKDE.2020.3000226
M3 - 文章
AN - SCOPUS:85126559082
SN - 1041-4347
VL - 34
SP - 2008
EP - 2019
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 4
ER -