Abstract
Traditional spectral clustering methods struggle with scalability and robustness on large datasets because they rely on similarity matrices and eigenvalue decomposition. We introduce two models: Rcut-based Coordinate Descent Clustering (R-CDC) and Ncut-based Doubly Stochastic Clustering (N-DSC). Both integrate graph construction and segmentation into a single process optimized with the coordinate descent method, significantly enhancing clustering efficacy. A novel graph structure improves robustness against noise and outliers, simplifying the clustering process and improving outcomes across diverse datasets. Our extensive experiments show that these models surpass existing spectral clustering techniques in handling large-scale data and complex structures. The code is available at https://github.com/happyduck-313/R-CDC-and-N-DSC.
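For readers unfamiliar with the two graph-cut criteria named above, the following is a minimal sketch of the textbook ratio cut (Rcut) and normalized cut (Ncut) objectives, evaluated for a fixed partition of a small graph. It is not the R-CDC or N-DSC model from the paper. The function names `cut_objectives` and `two_triangles` and the toy graph are hypothetical, chosen only for illustration. The sketch assumes a symmetric, non-negative weight matrix in which every cluster has positive total degree, and it omits the factor of 1/2 that some conventions include.

```python
def cut_objectives(W, labels):
    """Return (rcut, ncut) for a partition of a weighted undirected graph.

    W      : symmetric list-of-lists weight matrix (assumed non-negative)
    labels : cluster label for each node

    Textbook definitions, up to a constant factor:
      Rcut = sum over clusters of cut(A, complement of A) / |A|
      Ncut = sum over clusters of cut(A, complement of A) / vol(A)
    """
    n = len(W)
    degrees = [sum(row) for row in W]
    rcut = 0.0
    ncut = 0.0
    for c in sorted(set(labels)):
        members = [i for i in range(n) if labels[i] == c]
        others = [j for j in range(n) if labels[j] != c]
        # Total weight of edges leaving this cluster.
        cut = sum(W[i][j] for i in members for j in others)
        # Volume is the sum of degrees inside the cluster (assumed > 0).
        volume = sum(degrees[i] for i in members)
        rcut += cut / len(members)
        ncut += cut / volume
    return rcut, ncut


def two_triangles():
    """Hypothetical toy graph: two unit-weight triangles joined by one edge."""
    edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
    W = [[0.0] * 6 for _ in range(6)]
    for i, j in edges:
        W[i][j] = W[j][i] = 1.0
    return W


if __name__ == "__main__":
    W = two_triangles()
    natural = [0, 0, 0, 1, 1, 1]   # one cluster per triangle
    skewed = [0, 0, 1, 1, 1, 1]    # cuts through the first triangle
    print("natural split:", cut_objectives(W, natural))
    print("skewed split: ", cut_objectives(W, skewed))
```

On this toy graph the split that follows the two triangles scores lower on both objectives than the split that cuts through a triangle, which is the behavior both criteria are designed to reward.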
Original language | English |
---|---|
Journal | IEEE Transactions on Knowledge and Data Engineering |
State | Accepted/In press - 2025 |
Keywords
- clustering
- coordinate descent method
- graph cut
- machine learning