TY - JOUR
T1 - Exploiting local coherent patterns for unsupervised feature ranking
AU - Huang, Qinghua
AU - Tao, Dacheng
AU - Li, Xuelong
AU - Jin, Lianwen
AU - Wei, Gang
PY - 2011/12
Y1 - 2011/12
N2 - Prior to pattern recognition, feature selection is often used to identify relevant features and discard irrelevant ones to obtain improved analysis results. In this paper, we aim to develop an unsupervised feature ranking algorithm that evaluates features using discovered local coherent patterns, which are known as biclusters. The biclusters (viewed as submatrices) are discovered from a data matrix. These submatrices are used to score relevant features from two aspects, i.e., the interdependence of features and the separability of instances. The features are thereby ranked by their accumulated scores across all discovered biclusters before pattern classification. Experimental results show that the proposed method yields performance comparable to or better than the well-known Fisher score, Laplacian score, and variance score on three UCI data sets, improves the results of gene expression data analysis using gene ontology annotation, and demonstrates the advantage of unsupervised feature ranking for high-dimensional data.
AB - Prior to pattern recognition, feature selection is often used to identify relevant features and discard irrelevant ones to obtain improved analysis results. In this paper, we aim to develop an unsupervised feature ranking algorithm that evaluates features using discovered local coherent patterns, which are known as biclusters. The biclusters (viewed as submatrices) are discovered from a data matrix. These submatrices are used to score relevant features from two aspects, i.e., the interdependence of features and the separability of instances. The features are thereby ranked by their accumulated scores across all discovered biclusters before pattern classification. Experimental results show that the proposed method yields performance comparable to or better than the well-known Fisher score, Laplacian score, and variance score on three UCI data sets, improves the results of gene expression data analysis using gene ontology annotation, and demonstrates the advantage of unsupervised feature ranking for high-dimensional data.
UR - http://www.scopus.com/inward/record.url?scp=81955163023&partnerID=8YFLogxK
U2 - 10.1109/TSMCB.2011.2151256
DO - 10.1109/TSMCB.2011.2151256
M3 - Article
AN - SCOPUS:81955163023
SN - 1083-4419
VL - 41
SP - 1471
EP - 1482
JO - IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
JF - IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
IS - 6
M1 - 5887432
ER -