A new hybrid approach to predict subcellular localization by incorporating protein evolutionary conservation information

Shao Wu Zhang, Yun Long Zhang, Jun Hui Li, Hui Feng Yang, Yong Mei Cheng, Guo Ping Zhou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

The rapidly increasing number of sequences entering genome databanks has created the need for fully automated methods to analyze them. Knowing the cellular location of a protein is a key step towards understanding its function. Developing a statistical predictor of protein attributes generally involves two core tasks: constructing a training dataset and formulating a predictive algorithm. The latter can be further divided into two subtasks: finding a mathematical expression that effectively represents a protein, and finding a powerful algorithm that performs the prediction accurately. Here, an improved evolutionary conservation algorithm is proposed to calculate a per-residue conservation score. Each protein is then represented as a feature vector built from multi-scale energy (MSE). In addition, the protein is represented by further feature vectors based on amino acid composition (AAC), a weighted auto-correlation function, and moment descriptors. Finally, a novel hybrid approach is developed by fusing the four feature-specific classifiers through a product rule to predict 12 subcellular locations. Compared with existing methods, this new approach provides better predictive performance. High accuracies were obtained in both the jackknife cross-validation test and the independent dataset test, suggesting that introducing protein evolutionary information and fusing multi-feature classifiers are promising, and may also prove useful in other areas of molecular biology.
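
To make two of the ingredients named in the abstract concrete, the following is a minimal Python sketch of an amino acid composition (AAC) feature vector and of product-rule fusion of per-classifier class probabilities. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `eps` smoothing term, and the example probabilities are hypothetical, and the abstract does not specify the base classifiers or how their outputs are normalized.

```python
import math
from collections import Counter

# The 20 standard amino acids, in a fixed (alphabetical) order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the 20-dimensional AAC vector: fraction of each standard residue.

    Non-standard symbols (e.g. 'X') are ignored; this is an assumption of the sketch.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def product_rule_fusion(classifier_probs, eps=1e-12):
    """Fuse classifiers by multiplying their class probabilities.

    classifier_probs: one row per classifier, one probability per class.
    Returns (index of the winning class, normalized fused scores).
    The product is computed in log space to avoid underflow; eps
    (hypothetical) keeps a single zero from producing log(0).
    """
    n_classes = len(classifier_probs[0])
    if any(len(row) != n_classes for row in classifier_probs):
        raise ValueError("all classifiers must score the same classes")
    log_scores = [
        sum(math.log(row[k] + eps) for row in classifier_probs)
        for k in range(n_classes)
    ]
    peak = max(log_scores)
    unnormalized = [math.exp(s - peak) for s in log_scores]
    norm = sum(unnormalized)
    fused = [u / norm for u in unnormalized]
    best = max(range(n_classes), key=fused.__getitem__)
    return best, fused


if __name__ == "__main__":
    # Hypothetical toy sequence and toy outputs of four classifiers over 3 classes
    # (the paper predicts 12 locations; 3 are used here for brevity).
    features = amino_acid_composition("MKTAYIAKQR")
    toy_probs = [
        [0.5, 0.3, 0.2],
        [0.4, 0.4, 0.2],
        [0.6, 0.2, 0.2],
        [0.3, 0.5, 0.2],
    ]
    best, fused = product_rule_fusion(toy_probs)
    print(len(features), best, fused)
```

Working in log space is a numerical convenience only; it selects the same class as multiplying the probabilities directly. A known property of the product rule is that one classifier assigning a near-zero probability to a class can veto it, which is why the sketch adds a small smoothing term.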

Original language: English
Title of host publication: Life System Modeling and Simulation - International Conference, LSMS 2007, Proceedings
Publisher: Springer Verlag
Pages: 172-179
Number of pages: 8
ISBN (Print): 9783540747703
DOIs
State: Published - 2007
Event: 2007 International Conference on Life System Modeling and Simulation, LSMS 2007 - Shanghai, China
Duration: 14 Sep 2007 to 17 Sep 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4689 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2007 International Conference on Life System Modeling and Simulation, LSMS 2007
Country/Territory: China
City: Shanghai
Period: 14/09/07 to 17/09/07
