An improved k-means algorithm based on evidence distance

Ailin Zhu; Zexi Hua; Yu Shi; Yongchuan Tang; Lingwei Miao

doi:10.3390/e23111550

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

The clustering quality of the k-means algorithm depends mainly on the selection of the initial cluster centers and on the distance measure between sample points. The traditional k-means algorithm uses Euclidean distance to measure the distance between sample points; as a result, it differentiates poorly between the attributes of sample points and is prone to local optima. To address this, this paper proposes an improved k-means algorithm based on evidence distance. First, the attribute values of sample points are modelled as the basic probability assignment (BPA) of sample points. Then, the traditional Euclidean distance is replaced by the evidence distance for measuring the distance between sample points, and finally k-means clustering is carried out on UCI data. Experimental comparisons are made with the traditional k-means algorithm, the k-means algorithm based on the aggregation distance parameter, and the Gaussian mixture model. The experimental results show that the improved k-means algorithm based on evidence distance proposed in this paper achieves a better clustering effect and better convergence.
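The pipeline described in the abstract (attributes to BPA, evidence distance in place of Euclidean distance, then k-means) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the BPA is built or which evidence distance is used, so the sketch assumes non-negative attributes normalised to sum to 1 and the Jousselme distance d(m1, m2) = sqrt(0.5 (m1 - m2)^T D (m1 - m2)), with D(A, B) = |A ∩ B| / |A ∪ B|. The function names `to_bpa`, `jaccard_matrix`, `evidence_distances`, and `evidence_kmeans` are hypothetical.

```python
import numpy as np

def to_bpa(X):
    """Assumed BPA model: normalise each row of non-negative attributes to sum to 1."""
    X = np.asarray(X, dtype=float)
    return X / X.sum(axis=1, keepdims=True)

def jaccard_matrix(focal_sets):
    """D[i, j] = |A_i & A_j| / |A_i | A_j| for non-empty focal sets."""
    n = len(focal_sets)
    D = np.empty((n, n))
    for i, a in enumerate(focal_sets):
        for j, b in enumerate(focal_sets):
            D[i, j] = len(a & b) / len(a | b)
    return D

def evidence_distances(M, C, D):
    """Jousselme distance between every BPA in M (n x f) and every BPA in C (k x f)."""
    diff = M[:, None, :] - C[None, :, :]                  # shape (n, k, f)
    quad = np.einsum("nkf,fg,nkg->nk", diff, D, diff)     # (m - c)^T D (m - c)
    return np.sqrt(0.5 * np.maximum(quad, 0.0))           # clip tiny negative round-off

def evidence_kmeans(X, k, focal_sets, n_iter=100, seed=0):
    """Lloyd-style k-means with the evidence distance in the assignment step."""
    rng = np.random.default_rng(seed)
    M = to_bpa(X)
    D = jaccard_matrix(focal_sets)
    # Random initial centers drawn from the data (an assumption of this sketch).
    centers = M[rng.choice(len(M), size=k, replace=False)]
    for _ in range(n_iter):
        labels = evidence_distances(M, centers, D).argmin(axis=1)
        # The mean of BPAs is again a BPA; keep the old center if a cluster is empty.
        new_centers = np.array([
            M[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy usage: each attribute is treated as a singleton focal element.
X = [[9, 1], [8, 2], [1, 9], [2, 8]]
focal = [frozenset({"a"}), frozenset({"b"})]
labels, centers = evidence_kmeans(X, k=2, focal_sets=focal)
print(labels)
```

With singleton focal elements only, D is the identity matrix and the Jousselme distance reduces to a scaled Euclidean distance on the normalised vectors; the evidence distance differs from Euclidean distance only once some focal elements overlap, so the choice of BPA model matters for any gain over plain k-means.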

Original language	English
Article number	1550
Journal	Entropy
Volume	23
Issue number	11
DOIs	https://doi.org/10.3390/e23111550
State	Published - Nov 2021
Externally published	Yes

Keywords

Cluster analysis
Evidence distance
Evidence theory
K-means clustering

Access to Document

10.3390/e23111550

Cite this

@article{2162a88d8e854523b1032dec2eb52e7d,

title = "An improved k-means algorithm based on evidence distance",

abstract = "The clustering quality of the k-means algorithm depends mainly on the selection of the initial cluster centers and on the distance measure between sample points. The traditional k-means algorithm uses Euclidean distance to measure the distance between sample points; as a result, it differentiates poorly between the attributes of sample points and is prone to local optima. To address this, this paper proposes an improved k-means algorithm based on evidence distance. First, the attribute values of sample points are modelled as the basic probability assignment (BPA) of sample points. Then, the traditional Euclidean distance is replaced by the evidence distance for measuring the distance between sample points, and finally k-means clustering is carried out on UCI data. Experimental comparisons are made with the traditional k-means algorithm, the k-means algorithm based on the aggregation distance parameter, and the Gaussian mixture model. The experimental results show that the improved k-means algorithm based on evidence distance proposed in this paper achieves a better clustering effect and better convergence.",

keywords = "Cluster analysis, Evidence distance, Evidence theory, K-means clustering",

author = "Ailin Zhu and Zexi Hua and Yu Shi and Yongchuan Tang and Lingwei Miao",

note = "Publisher Copyright: {\textcopyright} 2021 by the authors. Licensee MDPI, Basel, Switzerland.",

year = "2021",

month = nov,

doi = "10.3390/e23111550",

language = "English",

volume = "23",

journal = "Entropy",

issn = "1099-4300",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "11",

}

TY - JOUR

T1 - An improved k-means algorithm based on evidence distance

AU - Zhu, Ailin

AU - Hua, Zexi

AU - Shi, Yu

AU - Tang, Yongchuan

AU - Miao, Lingwei

PY - 2021/11

Y1 - 2021/11

N2 - The clustering quality of the k-means algorithm depends mainly on the selection of the initial cluster centers and on the distance measure between sample points. The traditional k-means algorithm uses Euclidean distance to measure the distance between sample points; as a result, it differentiates poorly between the attributes of sample points and is prone to local optima. To address this, this paper proposes an improved k-means algorithm based on evidence distance. First, the attribute values of sample points are modelled as the basic probability assignment (BPA) of sample points. Then, the traditional Euclidean distance is replaced by the evidence distance for measuring the distance between sample points, and finally k-means clustering is carried out on UCI data. Experimental comparisons are made with the traditional k-means algorithm, the k-means algorithm based on the aggregation distance parameter, and the Gaussian mixture model. The experimental results show that the improved k-means algorithm based on evidence distance proposed in this paper achieves a better clustering effect and better convergence.

KW - Cluster analysis

KW - Evidence distance

KW - Evidence theory

KW - K-means clustering

UR - http://www.scopus.com/inward/record.url?scp=85119968604&partnerID=8YFLogxK

U2 - 10.3390/e23111550

DO - 10.3390/e23111550

M3 - Article

AN - SCOPUS:85119968604

SN - 1099-4300

VL - 23

JO - Entropy

JF - Entropy

IS - 11

M1 - 1550

ER -
