TY - JOUR
T1 - Using pseudo amino acid composition to predict protein subcellular location
T2 - Approached with amino acid composition distribution
AU - Shi, J. Y.
AU - Zhang, S. W.
AU - Pan, Q.
AU - Zhou, G. P.
PY - 2008/8
Y1 - 2008/8
N2 - In the Post Genome Age, there is an urgent need to develop reliable and effective computational methods to predict the subcellular localization of the explosion of newly found proteins. Here, a novel method of pseudo amino acid (PseAA) composition, the so-called "amino acid composition distribution" (AACD), is introduced. First, a protein sequence is divided equally into multiple segments. Then, the amino acid composition of each segment is calculated in series. After that, each protein sequence can be represented by a feature vector. Finally, the feature vectors of all sequences thus obtained are input into multi-class support vector machines to predict the subcellular localization. The results show that AACD is quite effective in representing protein sequences for the purpose of predicting protein subcellular localization.
AB - In the Post Genome Age, there is an urgent need to develop reliable and effective computational methods to predict the subcellular localization of the explosion of newly found proteins. Here, a novel method of pseudo amino acid (PseAA) composition, the so-called "amino acid composition distribution" (AACD), is introduced. First, a protein sequence is divided equally into multiple segments. Then, the amino acid composition of each segment is calculated in series. After that, each protein sequence can be represented by a feature vector. Finally, the feature vectors of all sequences thus obtained are input into multi-class support vector machines to predict the subcellular localization. The results show that AACD is quite effective in representing protein sequences for the purpose of predicting protein subcellular localization.
KW - Amino acid composition distribution
KW - Protein subcellular localization
KW - Pseudo amino acid composition
KW - Support vector machines
UR - http://www.scopus.com/inward/record.url?scp=49749119209&partnerID=8YFLogxK
U2 - 10.1007/s00726-007-0623-z
DO - 10.1007/s00726-007-0623-z
M3 - Article
C2 - 18209947
AN - SCOPUS:49749119209
SN - 0939-4451
VL - 35
SP - 321
EP - 327
JO - Amino Acids
JF - Amino Acids
IS - 2
ER -