TY - GEN
T1 - Prediction of protein-protein interaction using distance frequency of amino acids grouped with their physicochemical properties
AU - Zhang, Shao Wu
AU - Cheng, Yong Mei
AU - Luo, Li
AU - Pan, Quan
PY - 2011
Y1 - 2011
N2 - Protein-protein interactions (PPIs) play a key role in many cellular processes. These interactions form the basis of phenomena such as DNA replication and transcription, metabolic pathways, signaling pathways, and cell cycle control. Knowing how proteins interact with each other can help biologists understand the molecular mechanisms of the cell. Unfortunately, the experimental methods for identifying PPIs are both time-consuming and expensive. Therefore, developing computational approaches for predicting PPIs would be of significant value. Here, we propose a novel method for predicting PPIs using the distance frequency of amino acids grouped by their physicochemical properties (hydrophobicity, normalized van der Waals volume, polarity and polarizability) together with principal component analysis (PCA). First, the 20 standard amino acids were divided into three groups according to their values for each of the four physicochemical properties. Second, the distance frequency feature extraction method was introduced to represent the protein pairs, and the feature vectors extracted with the four physicochemical properties were fused to form different feature vector sets. Third, PCA was used to reduce the vector dimension, and a support vector machine was adopted as the classifier. The overall success rates of our method for hydrophobicity, normalized van der Waals volume, polarity and polarizability are 89.88%, 89.72%, 89.28% and 89.24%, respectively, in the 10-fold cross-validation (10CV) test, which are 6.65%, 8.05%, 9.72% and 8.09% higher than those of Guo's auto-covariance function feature extraction method. The overall prediction accuracy obtained by fusing the four physicochemical properties reaches 91.79%. The results show that the current approach is very promising for predicting PPIs and may become a useful tool in the relevant areas.
KW - Distance frequency
KW - PCA
KW - Protein-protein interaction
KW - Support vector machine
UR - http://www.scopus.com/inward/record.url?scp=80155179833&partnerID=8YFLogxK
U2 - 10.1109/BIC-TA.2011.53
DO - 10.1109/BIC-TA.2011.53
M3 - Conference contribution
AN - SCOPUS:80155179833
SN - 9780769545141
T3 - Proceedings - 2011 6th International Conference on Bio-Inspired Computing: Theories and Applications, BIC-TA 2011
SP - 70
EP - 74
BT - Proceedings - 2011 6th International Conference on Bio-Inspired Computing
T2 - 6th International Conference on Bio-Inspired Computing: Theories and Applications, BIC-TA 2011
Y2 - 27 September 2011 through 29 September 2011
ER -