TY - JOUR
T1 - An Efficient Ensemble Learning Approach for Predicting Protein-Protein Interactions by Integrating Protein Primary Sequence and Evolutionary Information
AU - You, Zhu Hong
AU - Huang, Wen Zhun
AU - Zhang, Shanwen
AU - Huang, Yu An
AU - Yu, Chang Qing
AU - Li, Li Ping
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/5/1
Y1 - 2019/5/1
N2 - Protein-protein interactions (PPIs) play an important role in many cellular processes, including signal transduction, post-translational modification, apoptosis, and cell growth. Deregulation of PPIs can lead to many diseases, including pernicious anemia and cancer. Although many high-throughput techniques have been designed to generate PPI data, they are generally expensive, inefficient, and labor-intensive. Hence, there is an urgent need for a computational method that detects PPIs accurately and rapidly. In this article, we propose a highly efficient method to detect PPIs by integrating a new protein sequence substitution matrix feature representation with an ensemble weighted sparse representation model classifier. The proposed method is demonstrated on a Saccharomyces cerevisiae dataset, where it achieves 99.26 percent prediction accuracy with 98.53 percent sensitivity at a precision of 100 percent, a much higher predictive accuracy than that of state-of-the-art methods. Extensive comparison experiments on benchmark datasets from Human and Helicobacter pylori show that the proposed method achieves better success rates than other existing approaches to this problem. The experimental results illustrate that the proposed method offers an economical approach to the computational construction of PPI networks and can serve as a helpful supplementary method for future proteomics research.
AB - Protein-protein interactions (PPIs) play an important role in many cellular processes, including signal transduction, post-translational modification, apoptosis, and cell growth. Deregulation of PPIs can lead to many diseases, including pernicious anemia and cancer. Although many high-throughput techniques have been designed to generate PPI data, they are generally expensive, inefficient, and labor-intensive. Hence, there is an urgent need for a computational method that detects PPIs accurately and rapidly. In this article, we propose a highly efficient method to detect PPIs by integrating a new protein sequence substitution matrix feature representation with an ensemble weighted sparse representation model classifier. The proposed method is demonstrated on a Saccharomyces cerevisiae dataset, where it achieves 99.26 percent prediction accuracy with 98.53 percent sensitivity at a precision of 100 percent, a much higher predictive accuracy than that of state-of-the-art methods. Extensive comparison experiments on benchmark datasets from Human and Helicobacter pylori show that the proposed method achieves better success rates than other existing approaches to this problem. The experimental results illustrate that the proposed method offers an economical approach to the computational construction of PPI networks and can serve as a helpful supplementary method for future proteomics research.
KW - Ensemble learning
KW - evolutionary information
KW - protein sequence
KW - protein-protein interactions
UR - http://www.scopus.com/inward/record.url?scp=85058454984&partnerID=8YFLogxK
U2 - 10.1109/TCBB.2018.2882423
DO - 10.1109/TCBB.2018.2882423
M3 - Article
AN - SCOPUS:85058454984
SN - 1545-5963
VL - 16
SP - 809
EP - 817
JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics
JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics
IS - 3
M1 - 8540898
ER -