An ensemble classifier with random projection for predicting protein-protein interactions using sequence and evolutionary information

Xiao Yu Song, Zhan Heng Chen, Xiang Yang Sun, Zhu Hong You, Li Ping Li, Yang Zhao

Research output: Contribution to journal › Article › peer-review

25 Scopus citations

Abstract

Identifying protein-protein interactions (PPIs) is crucial for understanding various biological processes in cells. Although high-throughput techniques have generated large amounts of PPI data for various species, these data cover only a small fraction of the entire PPI network. Furthermore, these techniques are costly and time-consuming and have a high error rate. Therefore, it is necessary to design computational methods for efficiently detecting PPIs. In this study, a random projection ensemble classifier (RPEC) was used to identify novel PPIs from the evolutionary information contained in protein amino acid sequences. The evolutionary information was obtained from a position-specific scoring matrix (PSSM) generated by PSI-BLAST. A novel feature fusion scheme was then developed by combining discrete cosine transform (DCT), fast Fourier transform (FFT), and singular value decomposition (SVD). Finally, the performance of the approach with the random projection ensemble classifier was evaluated on Yeast, Human, and H. pylori PPI datasets using 5-fold cross-validation. The approach achieved prediction accuracies of 95.64%, 96.59%, and 87.62%, respectively, outperforming other existing methods. Overall, the approach is promising and provides a practical and effective method for predicting novel PPIs.
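The classification step named in the abstract can be illustrated with a minimal sketch of a random projection ensemble. This is not the authors' implementation: the nearest-centroid base learner, the plain majority vote, the parameter values, and every name below (`RandomProjectionEnsemble`, `make_toy_data`, and so on) are assumptions chosen for illustration, and the toy data are synthetic vectors rather than PSSM-derived features. The published RPEC may select or weight projections differently.

```python
import math
import random


def random_projection(d, k, rng):
    """Return a k x d Gaussian random matrix scaled by 1/sqrt(k)."""
    return [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(d)]
            for _ in range(k)]


def project(matrix, x):
    """Map a d-dimensional vector x to k dimensions."""
    return [sum(r * v for r, v in zip(row, x)) for row in matrix]


def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


class RandomProjectionEnsemble:
    """Hypothetical ensemble: one nearest-centroid learner per projection."""

    def __init__(self, n_members=11, k=5, seed=0):
        self.n_members = n_members  # number of random projections (assumed)
        self.k = k                  # projected dimension (assumed)
        self.rng = random.Random(seed)
        self.members = []

    def fit(self, X, y):
        d = len(X[0])
        self.members = []
        for _ in range(self.n_members):
            R = random_projection(d, self.k, self.rng)
            Z = [project(R, x) for x in X]
            # Base learner: class centroids in the projected space.
            centroids = {
                label: centroid([z for z, t in zip(Z, y) if t == label])
                for label in sorted(set(y))
            }
            self.members.append((R, centroids))
        return self

    def predict_one(self, x):
        votes = {}
        for R, centroids in self.members:
            z = project(R, x)
            label = min(centroids, key=lambda c: sq_dist(z, centroids[c]))
            votes[label] = votes.get(label, 0) + 1
        # Majority vote across ensemble members.
        return max(votes, key=votes.get)


def make_toy_data(n_per_class=30, d=20, seed=1):
    """Synthetic two-class data; label 1 stands in for 'interacting'."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            X.append([center + rng.gauss(0.0, 0.3) for _ in range(d)])
            y.append(label)
    return X, y


X_train, y_train = make_toy_data()
model = RandomProjectionEnsemble().fit(X_train, y_train)
```

Calling `model.predict_one(vector)` on a 20-dimensional vector returns the majority label across the projections. In a setting like the one the abstract describes, each input vector would instead be the fused DCT, FFT, and SVD descriptor of a protein pair, and accuracy would be estimated with 5-fold cross-validation.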

Original language: English
Article number: 89
Journal: Applied Sciences (Switzerland)
Volume: 8
Issue number: 1
State: Published - 10 Jan 2018
Externally published: Yes

Keywords

  • Position-specific scoring matrix
  • Protein-protein interactions
  • Random projection ensemble classifier
  • Support vector machine
