Co-clustering ensemble based on bilateral k-means algorithm

Hui Yang; Han Peng; Jianyong Zhu; Feiping Nie

doi:10.1109/ACCESS.2020.2979915

School of Artificial Intelligence, OPtics and Electronics

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

Clustering ensemble techniques have been shown to improve the accuracy and stability of single clustering algorithms. With the development of information technology, the amount of data, such as images, text, and video, has increased rapidly, and efficiently clustering these large-scale datasets is a challenge. Clustering ensembles usually transform the base clustering results into a co-association matrix and then into a graph-partition problem. These methods may suffer from information loss when computing the similarity among samples or base clusterings. Rich information between samples and base clusterings is ignored. Moreover, the results are not discrete: post-processing steps are needed to obtain the final clustering result, which will deviate greatly from the real clustering result. To address these problems, we propose a co-clustering ensemble based on the bilateral k-means (CEBKM) algorithm. Our algorithm simultaneously clusters the samples and base clusterings of a dataset to fully exploit the potential information between them. In addition, it obtains the final clustering results directly, without using other clustering algorithms. The proposed method outperformed several state-of-the-art clustering ensemble methods in experiments conducted on real-world and toy datasets.
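
For readers unfamiliar with the conventional pipeline the abstract contrasts against, the following is a minimal sketch of a co-association matrix. It is not the CEBKM algorithm and is not taken from the paper; the function name `co_association` and the toy labels are illustrative assumptions. Each entry is the fraction of base clusterings in which two samples share a cluster.

```python
import numpy as np

def co_association(base_labels):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    base clusterings that put samples i and j in the same cluster.

    base_labels: array-like of shape (m, n), one row of cluster labels
    per base clustering (hypothetical input layout for this sketch).
    """
    base_labels = np.asarray(base_labels)
    m, n = base_labels.shape
    co = np.zeros((n, n))
    for labels in base_labels:
        # Pairwise "same cluster" indicator for this base clustering.
        co += labels[:, None] == labels[None, :]
    return co / m

# Three toy base clusterings of four samples (illustrative data only).
base_labels = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
C = co_association(base_labels)
```

The matrix keeps only sample-to-sample agreement; which base clustering produced which grouping is averaged away, which is the kind of information loss the abstract refers to.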

Original language	English
Article number	9032160
Pages (from-to)	51285-51294
Number of pages	10
Journal	IEEE Access
Volume	8
DOIs	https://doi.org/10.1109/ACCESS.2020.2979915
State	Published - 2020

Keywords

base clustering
bilateral k-means algorithm
Clustering ensemble
co-clustering

Cite this

@article{d00389363a534adab11c952fe17457d6,

title = "Co-clustering ensemble based on bilateral k-means algorithm",

abstract = "Clustering ensemble techniques have been shown to improve the accuracy and stability of single clustering algorithms. With the development of information technology, the amount of data, such as images, text, and video, has increased rapidly, and efficiently clustering these large-scale datasets is a challenge. Clustering ensembles usually transform the base clustering results into a co-association matrix and then into a graph-partition problem. These methods may suffer from information loss when computing the similarity among samples or base clusterings. Rich information between samples and base clusterings is ignored. Moreover, the results are not discrete: post-processing steps are needed to obtain the final clustering result, which will deviate greatly from the real clustering result. To address these problems, we propose a co-clustering ensemble based on the bilateral k-means (CEBKM) algorithm. Our algorithm simultaneously clusters the samples and base clusterings of a dataset to fully exploit the potential information between them. In addition, it obtains the final clustering results directly, without using other clustering algorithms. The proposed method outperformed several state-of-the-art clustering ensemble methods in experiments conducted on real-world and toy datasets.",

keywords = "base clustering, bilateral k-means algorithm, Clustering ensemble, co-clustering",

author = "Hui Yang and Han Peng and Jianyong Zhu and Feiping Nie",

note = "Publisher Copyright: {\textcopyright} 2013 IEEE.",

year = "2020",

doi = "10.1109/ACCESS.2020.2979915",

language = "English",

volume = "8",

pages = "51285--51294",

journal = "IEEE Access",

issn = "2169-3536",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Co-clustering ensemble based on bilateral k-means algorithm

AU - Yang, Hui

AU - Peng, Han

AU - Zhu, Jianyong

AU - Nie, Feiping

PY - 2020

Y1 - 2020

AB - Clustering ensemble techniques have been shown to improve the accuracy and stability of single clustering algorithms. With the development of information technology, the amount of data, such as images, text, and video, has increased rapidly, and efficiently clustering these large-scale datasets is a challenge. Clustering ensembles usually transform the base clustering results into a co-association matrix and then into a graph-partition problem. These methods may suffer from information loss when computing the similarity among samples or base clusterings. Rich information between samples and base clusterings is ignored. Moreover, the results are not discrete: post-processing steps are needed to obtain the final clustering result, which will deviate greatly from the real clustering result. To address these problems, we propose a co-clustering ensemble based on the bilateral k-means (CEBKM) algorithm. Our algorithm simultaneously clusters the samples and base clusterings of a dataset to fully exploit the potential information between them. In addition, it obtains the final clustering results directly, without using other clustering algorithms. The proposed method outperformed several state-of-the-art clustering ensemble methods in experiments conducted on real-world and toy datasets.

KW - base clustering

KW - bilateral k-means algorithm

KW - Clustering ensemble

KW - co-clustering

UR - http://www.scopus.com/inward/record.url?scp=85082517243&partnerID=8YFLogxK

U2 - 10.1109/ACCESS.2020.2979915

DO - 10.1109/ACCESS.2020.2979915

M3 - Article

AN - SCOPUS:85082517243

SN - 2169-3536

VL - 8

SP - 51285

EP - 51294

JO - IEEE Access

JF - IEEE Access

M1 - 9032160

ER -
