Sparsity Fuzzy C-Means Clustering with Principal Component Analysis Embedding

Jingwei Chen; Jianyong Zhu; Hongyun Jiang; Hui Yang; Feiping Nie

doi:10.1109/TFUZZ.2022.3217343

School of Artificial Intelligence, Optics and Electronics

Research output: Contribution to journal › Article › peer-review

16 Scopus citations

Abstract

The clustering method has been widely used in data mining, pattern recognition, and image identification. Fuzzy c-means (FCM) is a soft clustering method that introduces the concept of membership. In this method, the fuzzy membership matrix is obtained by calculating the distance between data points in the original space. However, these methods may yield suboptimal results owing to the influence of redundant features. Moreover, FCM is always sensitive to noise points and heavily subject to outliers. In this article, we propose a method called sparsity FCM clustering with principal component analysis embedding (P-SFCM). We simultaneously conduct principal component analysis and membership learning, and then add an additional weighting factor for each data point. The goal of this operation is to identify the noise or outliers. Overall, the benefit of our framework is that it retains most of the information in the subspace while improving the robustness of the noise. In this article, we employ an iterative optimization algorithm to efficiently solve our model. To verify the reliability of the proposed method, we conduct a convergence analysis, noise robustness analysis, and multicluster experiments. Furthermore, comparative experiments are conducted on both synthetic and real benchmark datasets. The experimental results show that the P-SFCM is competitive with comparable methods.
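For orientation, the distance-based membership computation that the abstract attributes to FCM can be sketched as follows. This is the standard textbook FCM membership update, not the P-SFCM model proposed in the article; the function name `fcm_memberships`, the fuzzifier default `m=2.0`, and the handling of points that coincide with a center are assumptions made for this illustration.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update (illustrative, not P-SFCM).

    u_ik = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1)),
    where d_ik is the Euclidean distance from point i to center k
    and m > 1 is the fuzzifier.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [math.dist(x, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # Assumption: a point lying exactly on one or more centers
            # splits its membership evenly among those centers.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        row = [
            1.0 / sum((d_k / d_j) ** exponent for d_j in dists)
            for d_k in dists
        ]
        memberships.append(row)
    return memberships


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
    ctrs = [(0.0, 0.0), (4.0, 0.0)]
    for p, row in zip(pts, fcm_memberships(pts, ctrs)):
        print(p, [round(u, 3) for u in row])
```

Because every distance is measured in the original feature space, redundant features and outlying points feed directly into the memberships, which is the weakness the abstract says P-SFCM targets with a PCA embedding and per-point weights.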

Original language	English
Pages (from-to)	2099-2111
Number of pages	13
Journal	IEEE Transactions on Fuzzy Systems
Volume	31
Issue number	7
DOIs	https://doi.org/10.1109/TFUZZ.2022.3217343
State	Published - 1 Jul 2023

Keywords

Clustering
dimensionality reduction
fuzzy c-means (FCM)
outliers
principal component analysis (PCA)
sparsity

Access to Document

10.1109/TFUZZ.2022.3217343

Cite this

@article{1eb0af911e994b8b829064fa3fdfeea7,

title = "Sparsity Fuzzy C-Means Clustering with Principal Component Analysis Embedding",

abstract = "The clustering method has been widely used in data mining, pattern recognition, and image identification. Fuzzy c-means (FCM) is a soft clustering method that introduces the concept of membership. In this method, the fuzzy membership matrix is obtained by calculating the distance between data points in the original space. However, these methods may yield suboptimal results owing to the influence of redundant features. Moreover, FCM is always sensitive to noise points and heavily subject to outliers. In this article, we propose a method called sparsity FCM clustering with principal component analysis embedding (P-SFCM). We simultaneously conduct principal component analysis and membership learning, and then add an additional weighting factor for each data point. The goal of this operation is to identify the noise or outliers. Overall, the benefit of our framework is that it retains most of the information in the subspace while improving the robustness of the noise. In this article, we employ an iterative optimization algorithm to efficiently solve our model. To verify the reliability of the proposed method, we conduct a convergence analysis, noise robustness analysis, and multicluster experiments. Furthermore, comparative experiments are conducted on both synthetic and real benchmark datasets. The experimental results show that the P-SFCM is competitive with comparable methods.",

keywords = "Clustering, dimensionality reduction, fuzzy c-means (FCM), outliers, principal component analysis (PCA), sparsity",

author = "Jingwei Chen and Jianyong Zhu and Hongyun Jiang and Hui Yang and Feiping Nie",

note = "Publisher Copyright: {\textcopyright} 1993-2012 IEEE.",

year = "2023",

month = jul,

day = "1",

doi = "10.1109/TFUZZ.2022.3217343",

language = "English",

volume = "31",

pages = "2099--2111",

journal = "IEEE Transactions on Fuzzy Systems",

issn = "1063-6706",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "7",

}

TY - JOUR

T1 - Sparsity Fuzzy C-Means Clustering with Principal Component Analysis Embedding

AU - Chen, Jingwei

AU - Zhu, Jianyong

AU - Jiang, Hongyun

AU - Yang, Hui

AU - Nie, Feiping

PY - 2023/7/1

Y1 - 2023/7/1

N2 - The clustering method has been widely used in data mining, pattern recognition, and image identification. Fuzzy c-means (FCM) is a soft clustering method that introduces the concept of membership. In this method, the fuzzy membership matrix is obtained by calculating the distance between data points in the original space. However, these methods may yield suboptimal results owing to the influence of redundant features. Moreover, FCM is always sensitive to noise points and heavily subject to outliers. In this article, we propose a method called sparsity FCM clustering with principal component analysis embedding (P-SFCM). We simultaneously conduct principal component analysis and membership learning, and then add an additional weighting factor for each data point. The goal of this operation is to identify the noise or outliers. Overall, the benefit of our framework is that it retains most of the information in the subspace while improving the robustness of the noise. In this article, we employ an iterative optimization algorithm to efficiently solve our model. To verify the reliability of the proposed method, we conduct a convergence analysis, noise robustness analysis, and multicluster experiments. Furthermore, comparative experiments are conducted on both synthetic and real benchmark datasets. The experimental results show that the P-SFCM is competitive with comparable methods.

AB - The clustering method has been widely used in data mining, pattern recognition, and image identification. Fuzzy c-means (FCM) is a soft clustering method that introduces the concept of membership. In this method, the fuzzy membership matrix is obtained by calculating the distance between data points in the original space. However, these methods may yield suboptimal results owing to the influence of redundant features. Moreover, FCM is always sensitive to noise points and heavily subject to outliers. In this article, we propose a method called sparsity FCM clustering with principal component analysis embedding (P-SFCM). We simultaneously conduct principal component analysis and membership learning, and then add an additional weighting factor for each data point. The goal of this operation is to identify the noise or outliers. Overall, the benefit of our framework is that it retains most of the information in the subspace while improving the robustness of the noise. In this article, we employ an iterative optimization algorithm to efficiently solve our model. To verify the reliability of the proposed method, we conduct a convergence analysis, noise robustness analysis, and multicluster experiments. Furthermore, comparative experiments are conducted on both synthetic and real benchmark datasets. The experimental results show that the P-SFCM is competitive with comparable methods.

KW - Clustering

KW - dimensionality reduction

KW - fuzzy c-means (FCM)

KW - outliers

KW - principal component analysis (PCA)

KW - sparsity

UR - http://www.scopus.com/inward/record.url?scp=85141450830&partnerID=8YFLogxK

U2 - 10.1109/TFUZZ.2022.3217343

DO - 10.1109/TFUZZ.2022.3217343

M3 - Article

AN - SCOPUS:85141450830

SN - 1063-6706

VL - 31

SP - 2099

EP - 2111

JO - IEEE Transactions on Fuzzy Systems

JF - IEEE Transactions on Fuzzy Systems

IS - 7

ER -
