An Efficient Ensemble Learning Approach for Predicting Protein-Protein Interactions by Integrating Protein Primary Sequence and Evolutionary Information

Zhu Hong You; Wen Zhun Huang; Shanwen Zhang; Yu An Huang; Chang Qing Yu; Li Ping Li

doi:10.1109/TCBB.2018.2882423

Research output: Contribution to journal › Article › peer-review

26 Citations (Scopus)

Abstract

Protein-protein interactions (PPIs) play an important role in many cellular processes, including signal transduction, post-translational modification, apoptosis, and cell growth. Deregulation of PPIs can lead to many diseases, including pernicious anemia and cancer. Although many high-throughput techniques have been designed to generate PPI data, they are generally expensive, inefficient, and labor-intensive. Hence, there is an urgent need for computational methods that detect PPIs accurately and rapidly. In this article, we propose a highly efficient method to detect PPIs by integrating a new protein sequence substitution matrix feature representation with an ensemble weighted sparse representation model classifier. On the Saccharomyces cerevisiae dataset, the proposed method achieved 99.26 percent prediction accuracy with 98.53 percent sensitivity at a precision of 100 percent, a much higher predictive accuracy than state-of-the-art methods. Extensive comparison experiments on benchmark datasets from Human and Helicobacter pylori show that the proposed method also achieves better success rates than existing approaches to this problem. The experimental results illustrate that the proposed method offers an economical approach to the computational construction of PPI networks and can serve as a helpful supplementary method for future proteomics research.
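
The three figures quoted in the abstract (accuracy, sensitivity, precision) are standard confusion-matrix metrics. The sketch below is an illustrative consistency check and is not taken from the paper. It assumes a balanced test set with equal numbers of interacting and non-interacting pairs, which the abstract does not state, and the counts are hypothetical values chosen only to reproduce the reported sensitivity and precision. Under that assumption the accuracy works out to 0.99265, which agrees with the reported 99.26 percent up to rounding.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute accuracy, sensitivity (recall), and precision from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "precision": tp / (tp + fp),
    }


# ASSUMPTION: a balanced test set. These sizes are hypothetical, not from the paper.
N_POS = 10_000  # interacting protein pairs
N_NEG = 10_000  # non-interacting protein pairs

tp = 9_853        # gives the reported 98.53 percent sensitivity
fn = N_POS - tp   # interacting pairs that were missed
fp = 0            # gives the reported 100 percent precision
tn = N_NEG - fp   # every non-interacting pair is rejected

metrics = binary_metrics(tp, fp, tn, fn)
for name, value in metrics.items():
    print(f"{name}: {value:.5f}")
```

With zero false positives, accuracy on a balanced set reduces to (sensitivity + 1) / 2, which is why the three reported numbers fit together.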

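Original language	English
Article number	8540898
Pages (from-to)	809-817
Number of pages	9
Journal	IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume	16
Issue number	3
DOI	https://doi.org/10.1109/TCBB.2018.2882423
Publication status	Published - 1 May 2019
Externally published	Yes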

Cite this

You, Z. H., Huang, W. Z., Zhang, S., Huang, Y. A., Yu, C. Q., & Li, L. P. (2019). An Efficient Ensemble Learning Approach for Predicting Protein-Protein Interactions by Integrating Protein Primary Sequence and Evolutionary Information. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 16(3), 809-817. Article 8540898. https://doi.org/10.1109/TCBB.2018.2882423

@article{b04c5e75e80c444ead77f7b75301df2a,

title = "An Efficient Ensemble Learning Approach for Predicting Protein-Protein Interactions by Integrating Protein Primary Sequence and Evolutionary Information",

abstract = "Protein-protein interactions (PPIs) perform a very important function in a number of cellular processes, including signal transduction, post-translational modifications, apoptosis, and cell growth. Deregulation of PPIs will lead to many diseases, including pernicious anemia or cancers. Although a large number of high-throughput techniques are designed to generate PPIs data, they are generally expensive, inefficient, and labor-intensive. Hence, there is an urgent need for developing a computational method to accurately and rapidly detect PPIs. In this article, we proposed a highly efficient method to detect PPIs by integrating a new protein sequence substitution matrix feature representation and ensemble weighted sparse representation model classifier. The proposed method is demonstrated on Saccharomyces cerevisiae dataset and achieved 99.26 percent prediction accuracy with 98.53 percent sensitivity at precision of 100 percent, which is shown to have much higher predictive accuracy than the state-of-the-art methods. Extensive contrast experiments are performed with the benchmark data set from Human and Helicobacter pylori that our proposed method can achieve outstanding better success rates than other existing approaches in this problem. Experiment results illustrate that our proposed method presents an economical approach for computational building of PPI networks, which can be a helpful supplementary method for future proteomics researches.",

keywords = "Ensemble learning, evolutionary information, protein sequence, protein-protein interactions",

author = "You, {Zhu Hong} and Huang, {Wen Zhun} and Shanwen Zhang and Huang, {Yu An} and Yu, {Chang Qing} and Li, {Li Ping}",

note = "Publisher Copyright: {\textcopyright} 2019 IEEE.",

year = "2019",

month = may,

day = "1",

doi = "10.1109/TCBB.2018.2882423",

language = "English",

volume = "16",

pages = "809--817",

journal = "IEEE/ACM Transactions on Computational Biology and Bioinformatics",

issn = "1545-5963",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "3",

}

You, ZH, Huang, WZ, Zhang, S, Huang, YA, Yu, CQ & Li, LP 2019, 'An Efficient Ensemble Learning Approach for Predicting Protein-Protein Interactions by Integrating Protein Primary Sequence and Evolutionary Information', IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 16, no. 3, 8540898, pp. 809-817. https://doi.org/10.1109/TCBB.2018.2882423

An Efficient Ensemble Learning Approach for Predicting Protein-Protein Interactions by Integrating Protein Primary Sequence and Evolutionary Information. / You, Zhu Hong; Huang, Wen Zhun; Zhang, Shanwen et al.
In: IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 16, No. 3, 8540898, 01.05.2019, p. 809-817.

TY - JOUR

T1 - An Efficient Ensemble Learning Approach for Predicting Protein-Protein Interactions by Integrating Protein Primary Sequence and Evolutionary Information

AU - You, Zhu Hong

AU - Huang, Wen Zhun

AU - Zhang, Shanwen

AU - Huang, Yu An

AU - Yu, Chang Qing

AU - Li, Li Ping

PY - 2019/5/1

Y1 - 2019/5/1

AB - Protein-protein interactions (PPIs) perform a very important function in a number of cellular processes, including signal transduction, post-translational modifications, apoptosis, and cell growth. Deregulation of PPIs will lead to many diseases, including pernicious anemia or cancers. Although a large number of high-throughput techniques are designed to generate PPIs data, they are generally expensive, inefficient, and labor-intensive. Hence, there is an urgent need for developing a computational method to accurately and rapidly detect PPIs. In this article, we proposed a highly efficient method to detect PPIs by integrating a new protein sequence substitution matrix feature representation and ensemble weighted sparse representation model classifier. The proposed method is demonstrated on Saccharomyces cerevisiae dataset and achieved 99.26 percent prediction accuracy with 98.53 percent sensitivity at precision of 100 percent, which is shown to have much higher predictive accuracy than the state-of-the-art methods. Extensive contrast experiments are performed with the benchmark data set from Human and Helicobacter pylori that our proposed method can achieve outstanding better success rates than other existing approaches in this problem. Experiment results illustrate that our proposed method presents an economical approach for computational building of PPI networks, which can be a helpful supplementary method for future proteomics researches.

KW - Ensemble learning

KW - evolutionary information

KW - protein sequence

KW - protein-protein interactions

UR - http://www.scopus.com/inward/record.url?scp=85058454984&partnerID=8YFLogxK

U2 - 10.1109/TCBB.2018.2882423

DO - 10.1109/TCBB.2018.2882423

M3 - Article

AN - SCOPUS:85058454984

SN - 1545-5963

VL - 16

SP - 809

EP - 817

JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics

JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics

IS - 3

M1 - 8540898

ER -
