TY - JOUR
T1 - A local and global discriminative framework and optimization for balanced clustering
AU - Han, Junwei
AU - Liu, Hanyang
AU - Nie, Feiping
N1 - Publisher Copyright:
© 2012 IEEE.
PY - 2019/10
Y1 - 2019/10
AB - For many specific applications in data mining and machine learning, we face an explicit or latent size constraint for each cluster, which leads to the 'balanced clustering' problem. Many existing clustering algorithms perform well in partitioning but fail to produce balanced clusters and to preserve the naturally balanced structure of some data. In this paper, we propose a novel balanced clustering framework that flexibly utilizes local and global information of data. First, we propose the global balanced clustering (GBC), in which a global discriminative partitioning model is combined with the minimization of the distribution entropy of data. Then, we show that the proposed GBC can be further used to globally regularize some widely used local clustering models, so as to transform them into balanced clustering models that simultaneously capture local and global information of data. We apply our global balanced regularization to spectral clustering (SC) and local learning (LL)-based clustering, respectively, and propose two further novel balanced clustering models: the local and global balanced SC (LGB-SC) and LGB-LL. Finding the optimal balanced partition is nondeterministic polynomial-time (NP)-hard in general. We adopt the method of augmented Lagrange multipliers to help optimize our model. Comprehensive experiments on several real-world benchmarks demonstrate the advantage of our framework in yielding balanced clusters while preserving good clustering quality. Our proposed LGB-SC and LGB-LL also outperform SC and LL as well as other classical clustering methods.
KW - Balanced clustering
KW - linear regression
KW - local learning (LL)
KW - spectral clustering (SC)
UR - http://www.scopus.com/inward/record.url?scp=85055033747&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2018.2870131
DO - 10.1109/TNNLS.2018.2870131
M3 - Article
C2 - 30334771
AN - SCOPUS:85055033747
SN - 2162-237X
VL - 30
SP - 3059
EP - 3071
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 10
M1 - 8490741
ER -