TY - JOUR
T1 - Co-clustering ensemble based on bilateral k-means algorithm
AU - Yang, Hui
AU - Peng, Han
AU - Zhu, Jianyong
AU - Nie, Feiping
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2020
Y1 - 2020
N2 - Clustering ensemble techniques have been shown to be effective in improving the accuracy and stability of single clustering algorithms. With the development of information technology, the amount of data, such as images, text and video, has increased rapidly, and efficiently clustering these large-scale datasets is a challenge. Clustering ensembles usually transform the base clustering results into a co-association matrix and then solve a graph-partition problem. These methods may suffer from information loss when computing the similarity among samples or base clusterings; rich information between samples and base clusterings is ignored. Moreover, their results are not discrete: they need post-processing steps to obtain the final clustering result, which can deviate greatly from the real clustering result. To address these problems, we propose a co-clustering ensemble algorithm based on bilateral k-means (CEBKM). Our algorithm simultaneously clusters the samples and the base clusterings of a dataset to fully exploit the potential information between them. In addition, it directly obtains the final clustering result without using other clustering algorithms. The proposed method outperformed several state-of-the-art clustering ensemble methods in experiments conducted on real-world and toy datasets.
AB - Clustering ensemble techniques have been shown to be effective in improving the accuracy and stability of single clustering algorithms. With the development of information technology, the amount of data, such as images, text and video, has increased rapidly, and efficiently clustering these large-scale datasets is a challenge. Clustering ensembles usually transform the base clustering results into a co-association matrix and then solve a graph-partition problem. These methods may suffer from information loss when computing the similarity among samples or base clusterings; rich information between samples and base clusterings is ignored. Moreover, their results are not discrete: they need post-processing steps to obtain the final clustering result, which can deviate greatly from the real clustering result. To address these problems, we propose a co-clustering ensemble algorithm based on bilateral k-means (CEBKM). Our algorithm simultaneously clusters the samples and the base clusterings of a dataset to fully exploit the potential information between them. In addition, it directly obtains the final clustering result without using other clustering algorithms. The proposed method outperformed several state-of-the-art clustering ensemble methods in experiments conducted on real-world and toy datasets.
KW - base clustering
KW - bilateral k-means algorithm
KW - clustering ensemble
KW - co-clustering
UR - http://www.scopus.com/inward/record.url?scp=85082517243&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2020.2979915
DO - 10.1109/ACCESS.2020.2979915
M3 - Article
AN - SCOPUS:85082517243
SN - 2169-3536
VL - 8
SP - 51285
EP - 51294
JO - IEEE Access
JF - IEEE Access
M1 - 9032160
ER -