TY - GEN
T1 - Predicting protein-protein interactions from multimodal biological data sources via nonnegative matrix tri-factorization
AU - Wang, Hua
AU - Huang, Heng
AU - Ding, Chris
AU - Nie, Feiping
PY - 2012
Y1 - 2012
N2 - Due to the high false positive rate of high-throughput experimental methods for discovering protein interactions, computational methods are necessary and crucial to complete the interactome expeditiously. However, when building classification models to identify putative protein interactions, positive samples can be drawn from truly interacting protein pairs, whereas negative samples are usually very hard to select: non-interacting protein pairs are simply those that currently lack experimental or computational evidence of a physical interaction or a functional association, and they could still interact in reality. To tackle this difficulty, instead of using heuristics as in many existing works, in this paper we solve it in a principled way by formulating protein interaction prediction from a new mathematical perspective, sparse matrix completion, and propose a novel Nonnegative Matrix Tri-Factorization (NMTF) based matrix completion approach to predict new protein interactions from existing protein interaction networks. Because matrix completion requires only positive samples and no negative samples, the challenge faced by existing classification-based methods for protein interaction prediction is circumvented. Using manifold regularization, we further extend our method to integrate different biological data sources, such as protein sequences, gene expressions, and protein structure information. Extensive experimental results on the Saccharomyces cerevisiae genome show that our new methods outperform related state-of-the-art protein interaction prediction methods.
KW - Multimodal Biological Data
KW - Nonnegative Matrix Factorization
KW - Protein-Protein Interaction
UR - http://www.scopus.com/inward/record.url?scp=84860831442&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-29627-7_33
DO - 10.1007/978-3-642-29627-7_33
M3 - Conference contribution
AN - SCOPUS:84860831442
SN - 9783642296260
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 314
EP - 325
BT - Research in Computational Molecular Biology - 16th Annual International Conference, RECOMB 2012, Proceedings
T2 - 16th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2012
Y2 - 21 April 2012 through 24 April 2012
ER -