A robust entropy regularized K-means clustering algorithm for processing noise in datasets

Peilin Jiang, Junnan Cao, Weizhong Yu, Feiping Nie

Research output: Contribution to journal, Article, peer-reviewed

Abstract

K-means is a widely used clustering algorithm. Because it is simple to implement yet effective, it is applied in fields such as data mining, cluster analysis, data preprocessing, and unsupervised learning. However, K-means is sensitive to outliers: if a low-dimensional sample set contains a certain number of outliers, the resulting cluster centers are strongly disturbed, which degrades the clustering results. Outliers can be detected before clustering, but this two-stage approach affects the accuracy of the clustering results. To address this issue, we propose an improved, robust Entropy Regularized K-Means clustering algorithm. Our method builds on the Entropy Regularized K-Means clustering algorithm and adds a weight value to the objective function so that outlying data are ignored and a more accurate number of clusters is obtained, thereby performing clustering and outlier detection simultaneously. The advantages of the algorithm are strong resistance to interference, the ability to ignore the influence of outliers on cluster centers, and simultaneous clustering and detection. We tested the improved algorithm on artificial and real datasets, demonstrating that it determines cluster centers more accurately and can find some of the outliers.
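The abstract does not state the objective function, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes the textbook entropy-regularized (soft) K-means update, in which memberships are a softmax of negative squared distances scaled by a hypothetical parameter `gamma`, and it assumes a simple binary per-point weight that zeroes out the `n_outliers` points farthest from any center in each iteration. The function name, its parameters, and the trimming rule are all assumptions made for illustration.

```python
import numpy as np

def robust_entropy_kmeans(X, init_centers, gamma=1.0, n_outliers=0, n_iter=100):
    """Soft K-means with entropy regularization and binary outlier weights.

    Illustrative sketch only: the weighting rule (drop the n_outliers points
    with the largest distance to their nearest center) is an assumption, not
    the rule from the paper.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()
    n = X.shape[0]
    for _ in range(n_iter):
        # Squared Euclidean distances, shape (n_points, n_clusters).
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Entropy-regularized memberships: row-wise softmax of -d / gamma,
        # shifted by the row minimum for numerical stability.
        u = np.exp(-(d - d.min(axis=1, keepdims=True)) / gamma)
        u /= u.sum(axis=1, keepdims=True)
        # Binary weights: 0 for the points farthest from every center.
        weights = np.ones(n)
        if n_outliers > 0:
            weights[np.argsort(d.min(axis=1))[-n_outliers:]] = 0.0
        # Weighted center update; clusters with no mass keep their old center.
        wu = weights[:, None] * u
        mass = wu.sum(axis=0)
        new_centers = centers.copy()
        nonempty = mass > 1e-12
        new_centers[nonempty] = (wu.T @ X)[nonempty] / mass[nonempty, None]
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    return centers, u, weights

# Toy data: two tight groups plus one far-away point at index 8.
X_demo = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                   [10, 10], [10, 11], [11, 10], [11, 11],
                   [100, 100]], dtype=float)
centers, u, weights = robust_entropy_kmeans(
    X_demo, init_centers=[[1.0, 1.0], [9.0, 9.0]], n_outliers=1)
print(centers)
print(weights)
```

In this sketch, clustering and outlier flagging happen in the same loop: points with weight 0 contribute nothing to the center update, so a distant point cannot drag a center toward it. Initial centers are passed explicitly because, like plain K-means, the procedure can settle in a poor local optimum if an outlier is chosen as a starting center.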

Original language: English
Article number: 106518
Pages (from-to): 6617-6632
Number of pages: 16
Journal: Neural Computing and Applications
Volume: 37
Issue number: 9
Publication status: Published - Mar 2025
