TY - GEN
T1 - An improved graph entropy-based method for identifying protein complexes
AU - Chen, Bolin
AU - Yan, Yan
AU - Shi, Jinhong
AU - Zhang, Shenggui
AU - Wu, Fang Xiang
PY - 2011
Y1 - 2011
AB - Protein complexes are essential entities that carry out the major cellular processes and biological functions in living organisms. Identifying the component proteins of a complex from protein-protein interaction (PPI) networks is an important step toward understanding the organization and interaction of gene products. In the existing literature, methods for identifying protein complexes typically start from a selected seed, commonly a vertex (a single protein), in a PPI network. However, in many circumstances a single protein seed is not enough to generate a meaningful complex, or more than one protein in a complex is already known. In this paper, we present an improved seed-growth style algorithm that identifies protein complexes from PPI networks based on the concept of graph entropy. Unlike existing methods, the seed is assumed to be a clique (e.g., a vertex, an edge, or a triangle) in a PPI network. Computational experiments were conducted on the PPI network of S. cerevisiae. The results show that the larger the cliques considered as seeds, the better the presented method performs in terms of f-score. In particular, when cliques up to K3 are included as seeds, the average f-score is 57.32%, which is better than that of existing methods.
KW - graph clustering algorithm
KW - graph entropy
KW - protein complex
KW - protein-protein interaction network
UR - http://www.scopus.com/inward/record.url?scp=84862954998&partnerID=8YFLogxK
U2 - 10.1109/BIBM.2011.66
DO - 10.1109/BIBM.2011.66
M3 - Conference contribution
AN - SCOPUS:84862954998
SN - 9780769545745
T3 - Proceedings - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011
SP - 123
EP - 126
BT - Proceedings - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011
T2 - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011
Y2 - 12 November 2011 through 15 November 2011
ER -