An improved graph entropy-based method for identifying protein complexes

Bolin Chen; Yan Yan; Jinhong Shi; Shenggui Zhang; Fang Xiang Wu

doi:10.1109/BIBM.2011.66

School of Mathematics and Statistics

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Scopus citations

Abstract

Protein complexes are essential entities that carry out the major cellular processes and biological functions in living organisms. Identifying the component proteins of a complex from protein-protein interaction (PPI) networks is an important step toward understanding the organization and interaction of gene products. In the existing literature, methods for identifying protein complexes typically start from a selected seed, commonly a vertex (a single protein), in a PPI network. However, in many circumstances a single protein seed is not enough to generate a meaningful complex, or more than one protein in a complex is already known. In this paper, we present an improved seed-growth style algorithm to identify protein complexes from PPI networks based on the concept of graph entropy. Unlike existing methods, the seed is assumed to be a clique (e.g., a vertex, an edge, or a triangle) in a PPI network. Computational experiments were conducted on the PPI network of S. cerevisiae. The results show that the larger the cliques used as seeds, the better the presented method performs in terms of f-score. In particular, when cliques up to K3 are included as seeds, the average f-score is 57.32%, which is better than that of existing methods.
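
The seed-growth idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not define graph entropy or the growth rule, so everything here is an assumption: the entropy is taken to be the commonly used per-vertex binary entropy of the fraction of a vertex's neighbours that lie inside the cluster, summed over all vertices, and growth is a simple greedy loop that adds a neighbouring vertex whenever the total entropy drops. The names `vertex_entropy`, `graph_entropy`, `is_clique`, `grow_from_seed`, and the toy network `toy_ppi` are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def vertex_entropy(graph, cluster, v):
    """Binary entropy of the share of v's neighbours inside the cluster (assumed definition)."""
    neighbours = graph[v]
    if not neighbours:
        return 0.0
    p_in = len(neighbours & cluster) / len(neighbours)
    if p_in <= 0.0 or p_in >= 1.0:
        return 0.0  # all neighbours inside or all outside: no uncertainty
    p_out = 1.0 - p_in
    return -p_in * math.log2(p_in) - p_out * math.log2(p_out)


def graph_entropy(graph, cluster):
    """Sum of vertex entropies over every vertex of the graph."""
    return sum(vertex_entropy(graph, cluster, v) for v in graph)


def is_clique(graph, vertices):
    """True if every pair of the given vertices is connected by an edge."""
    return all(b in graph[a] for a, b in combinations(vertices, 2))


def grow_from_seed(graph, seed):
    """Greedily add neighbouring vertices while the graph entropy decreases."""
    if not seed or not is_clique(graph, seed):
        raise ValueError("seed must be a non-empty clique")
    cluster = set(seed)
    current = graph_entropy(graph, cluster)
    improved = True
    while improved:
        improved = False
        # Vertices adjacent to the cluster but not yet part of it.
        frontier = set().union(*(graph[v] for v in cluster)) - cluster
        for v in sorted(frontier):
            trial = graph_entropy(graph, cluster | {v})
            if trial < current:
                cluster.add(v)
                current = trial
                improved = True
    return cluster


# Hypothetical toy network: two triangles joined by the single edge c-d.
toy_ppi = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}

vertex_seed_result = grow_from_seed(toy_ppi, {"a"})      # K1 seed
edge_seed_result = grow_from_seed(toy_ppi, {"a", "b"})   # K2 seed
```

Under these assumptions, the single-vertex seed `{"a"}` cannot grow on the toy network because every one-vertex addition raises the entropy, whereas the edge seed `{"a", "b"}` grows into the triangle `{"a", "b", "c"}`. This is only a toy illustration of why a larger clique seed can help; it says nothing about the paper's actual procedure or results.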

Original language	English
Title of host publication	Proceedings - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011
Pages	123-126
Number of pages	4
DOIs	https://doi.org/10.1109/BIBM.2011.66
State	Published - 2011
Event	2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011 - Atlanta, GA, United States; Duration: 12 Nov 2011 → 15 Nov 2011

Publication series

Name	Proceedings - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011

Conference

Conference	2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011
Country/Territory	United States
City	Atlanta, GA
Period	12 Nov 2011 → 15 Nov 2011

Keywords

graph clustering algorithm
graph entropy
protein complex
protein-protein interaction network

Cite this

Chen, B., Yan, Y., Shi, J., Zhang, S., & Wu, F. X. (2011). An improved graph entropy-based method for identifying protein complexes. In Proceedings - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011 (pp. 123-126). Article 6120420 (Proceedings - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011). https://doi.org/10.1109/BIBM.2011.66

@inproceedings{16f1092fc311498d8445fa5fd4362f19,

title = "An improved graph entropy-based method for identifying protein complexes",

abstract = "Protein complexes are essential entities that carry out the major cellular processes and biological functions in living organisms. Identifying the component proteins of a complex from protein-protein interaction (PPI) networks is an important step toward understanding the organization and interaction of gene products. In the existing literature, methods for identifying protein complexes typically start from a selected seed, commonly a vertex (a single protein), in a PPI network. However, in many circumstances a single protein seed is not enough to generate a meaningful complex, or more than one protein in a complex is already known. In this paper, we present an improved seed-growth style algorithm to identify protein complexes from PPI networks based on the concept of graph entropy. Unlike existing methods, the seed is assumed to be a clique (e.g., a vertex, an edge, or a triangle) in a PPI network. Computational experiments were conducted on the PPI network of S. cerevisiae. The results show that the larger the cliques used as seeds, the better the presented method performs in terms of f-score. In particular, when cliques up to K3 are included as seeds, the average f-score is 57.32%, which is better than that of existing methods.",

keywords = "graph clustering algorithm, graph entropy, protein complex, protein-protein interaction network",

author = "Bolin Chen and Yan Yan and Jinhong Shi and Shenggui Zhang and Wu, {Fang Xiang}",

year = "2011",

doi = "10.1109/BIBM.2011.66",

language = "English",

isbn = "9780769545745",

series = "Proceedings - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011",

pages = "123--126",

booktitle = "Proceedings - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011",

note = "2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011 ; Conference date: 12-11-2011 Through 15-11-2011",

}

Chen, B, Yan, Y, Shi, J, Zhang, S & Wu, FX 2011, An improved graph entropy-based method for identifying protein complexes. in Proceedings - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011., 6120420, Proceedings - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011, pp. 123-126, 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011, Atlanta, GA, United States, 12/11/11. https://doi.org/10.1109/BIBM.2011.66

An improved graph entropy-based method for identifying protein complexes. / Chen, Bolin; Yan, Yan; Shi, Jinhong et al.
Proceedings - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011. 2011. p. 123-126 6120420 (Proceedings - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011).

TY - GEN

T1 - An improved graph entropy-based method for identifying protein complexes

AU - Chen, Bolin

AU - Yan, Yan

AU - Shi, Jinhong

AU - Zhang, Shenggui

AU - Wu, Fang Xiang

PY - 2011

Y1 - 2011

AB - Protein complexes are essential entities that carry out the major cellular processes and biological functions in living organisms. Identifying the component proteins of a complex from protein-protein interaction (PPI) networks is an important step toward understanding the organization and interaction of gene products. In the existing literature, methods for identifying protein complexes typically start from a selected seed, commonly a vertex (a single protein), in a PPI network. However, in many circumstances a single protein seed is not enough to generate a meaningful complex, or more than one protein in a complex is already known. In this paper, we present an improved seed-growth style algorithm to identify protein complexes from PPI networks based on the concept of graph entropy. Unlike existing methods, the seed is assumed to be a clique (e.g., a vertex, an edge, or a triangle) in a PPI network. Computational experiments were conducted on the PPI network of S. cerevisiae. The results show that the larger the cliques used as seeds, the better the presented method performs in terms of f-score. In particular, when cliques up to K3 are included as seeds, the average f-score is 57.32%, which is better than that of existing methods.

KW - graph clustering algorithm

KW - graph entropy

KW - protein complex

KW - protein-protein interaction network

UR - http://www.scopus.com/inward/record.url?scp=84862954998&partnerID=8YFLogxK

U2 - 10.1109/BIBM.2011.66

DO - 10.1109/BIBM.2011.66

M3 - Conference contribution

AN - SCOPUS:84862954998

SN - 9780769545745

T3 - Proceedings - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011

SP - 123

EP - 126

BT - Proceedings - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011

T2 - 2011 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2011

Y2 - 12 November 2011 through 15 November 2011

ER -
