TY - JOUR
T1 - Predicting protein-protein interactions from protein sequences by a stacked sparse autoencoder deep neural network
AU - Wang, Yan Bin
AU - You, Zhu Hong
AU - Li, Xiao
AU - Jiang, Tong Hai
AU - Chen, Xing
AU - Zhou, Xi
AU - Wang, Lei
N1 - Publisher Copyright:
© 2017 The Royal Society of Chemistry.
PY - 2017
Y1 - 2017
N2 - Protein-protein interactions (PPIs) play an important role in most of the biological processes. How to correctly and efficiently detect protein interaction is a problem that is worth studying. Although high-throughput technologies provide the possibility to detect large-scale PPIs, these cannot be used to detect whole PPIs, and unreliable data may be generated. To solve this problem, in this study, a novel computational method was proposed to effectively predict the PPIs using the information of a protein sequence. The present method adopts Zernike moments to extract the protein sequence feature from a position specific scoring matrix (PSSM). Then, these extracted features were reconstructed using the stacked autoencoder. Finally, a novel probabilistic classification vector machine (PCVM) classifier was employed to predict the protein-protein interactions. When performed on the PPIs datasets of Yeast and H. pylori, the proposed method could achieve average accuracies of 96.60% and 91.19%, respectively. The promising result shows that the proposed method has a better ability to detect PPIs than other detection methods. The proposed method was also applied to predict PPIs on other species, and promising results were obtained. To evaluate the ability of our method, we compared it with the state-of-the-art support vector machine (SVM) classifier for the Yeast dataset. The results obtained via multiple experiments prove that our method is powerful, efficient, feasible, and makes a great contribution to proteomics research.
AB - Protein-protein interactions (PPIs) play an important role in most of the biological processes. How to correctly and efficiently detect protein interaction is a problem that is worth studying. Although high-throughput technologies provide the possibility to detect large-scale PPIs, these cannot be used to detect whole PPIs, and unreliable data may be generated. To solve this problem, in this study, a novel computational method was proposed to effectively predict the PPIs using the information of a protein sequence. The present method adopts Zernike moments to extract the protein sequence feature from a position specific scoring matrix (PSSM). Then, these extracted features were reconstructed using the stacked autoencoder. Finally, a novel probabilistic classification vector machine (PCVM) classifier was employed to predict the protein-protein interactions. When performed on the PPIs datasets of Yeast and H. pylori, the proposed method could achieve average accuracies of 96.60% and 91.19%, respectively. The promising result shows that the proposed method has a better ability to detect PPIs than other detection methods. The proposed method was also applied to predict PPIs on other species, and promising results were obtained. To evaluate the ability of our method, we compared it with the state-of-the-art support vector machine (SVM) classifier for the Yeast dataset. The results obtained via multiple experiments prove that our method is powerful, efficient, feasible, and makes a great contribution to proteomics research.
UR - http://www.scopus.com/inward/record.url?scp=85021626613&partnerID=8YFLogxK
U2 - 10.1039/c7mb00188f
DO - 10.1039/c7mb00188f
M3 - Article
AN - SCOPUS:85021626613
SN - 1742-206X
VL - 13
SP - 1336
EP - 1344
JO - Molecular BioSystems
JF - Molecular BioSystems
IS - 7
ER -