TY - JOUR
T1 - Using discriminative vector machine model with 2DPCA to predict interactions among proteins
AU - Li, Zhengwei
AU - Nie, Ru
AU - You, Zhuhong
AU - Cao, Chen
AU - Li, Jiashu
N1 - Publisher Copyright:
© 2019 The Author(s).
PY - 2019/12/24
Y1 - 2019/12/24
N2 - Background: The interactions among proteins act as crucial roles in most cellular processes. Despite enormous effort put for identifying protein-protein interactions (PPIs) from a large number of organisms, existing firsthand biological experimental methods are high cost, low efficiency, and high false-positive rate. The application of in silico methods opens new doors for predicting interactions among proteins, and has been attracted a great deal of attention in the last decades. Results: Here we present a novelty computational model with the adoption of our proposed Discriminative Vector Machine (DVM) model and a 2-Dimensional Principal Component Analysis (2DPCA) descriptor to identify candidate PPIs only based on protein sequences. To be more specific, a 2DPCA descriptor is employed to capture discriminative feature information from Position-Specific Scoring Matrix (PSSM) of amino acid sequences by the tool of PSI-BLAST. Then, a robust and powerful DVM classifier is employed to infer PPIs. When applied on both gold benchmark datasets of Yeast and H. pylori, our model obtained mean prediction accuracies as high as of 97.06 and 92.89%, respectively, which demonstrates a noticeable improvement than some state-of-the-art methods. Moreover, we constructed Support Vector Machines (SVM) based predictive model and made comparison it with our model on Human benchmark dataset. In addition, to further demonstrate the predictive reliability of our proposed method, we also carried out extensive experiments for identifying cross-species PPIs on five other species datasets. Conclusions: All the experimental results indicate that our method is very effective for identifying potential PPIs and could serve as a practical approach to aid bioexperiment in proteomics research.
AB - Background: The interactions among proteins act as crucial roles in most cellular processes. Despite enormous effort put for identifying protein-protein interactions (PPIs) from a large number of organisms, existing firsthand biological experimental methods are high cost, low efficiency, and high false-positive rate. The application of in silico methods opens new doors for predicting interactions among proteins, and has been attracted a great deal of attention in the last decades. Results: Here we present a novelty computational model with the adoption of our proposed Discriminative Vector Machine (DVM) model and a 2-Dimensional Principal Component Analysis (2DPCA) descriptor to identify candidate PPIs only based on protein sequences. To be more specific, a 2DPCA descriptor is employed to capture discriminative feature information from Position-Specific Scoring Matrix (PSSM) of amino acid sequences by the tool of PSI-BLAST. Then, a robust and powerful DVM classifier is employed to infer PPIs. When applied on both gold benchmark datasets of Yeast and H. pylori, our model obtained mean prediction accuracies as high as of 97.06 and 92.89%, respectively, which demonstrates a noticeable improvement than some state-of-the-art methods. Moreover, we constructed Support Vector Machines (SVM) based predictive model and made comparison it with our model on Human benchmark dataset. In addition, to further demonstrate the predictive reliability of our proposed method, we also carried out extensive experiments for identifying cross-species PPIs on five other species datasets. Conclusions: All the experimental results indicate that our method is very effective for identifying potential PPIs and could serve as a practical approach to aid bioexperiment in proteomics research.
UR - http://www.scopus.com/inward/record.url?scp=85077119658&partnerID=8YFLogxK
U2 - 10.1186/s12859-019-3268-5
DO - 10.1186/s12859-019-3268-5
M3 - 文章
C2 - 31874626
AN - SCOPUS:85077119658
SN - 1471-2105
VL - 20
JO - BMC Bioinformatics
JF - BMC Bioinformatics
M1 - 694
ER -