Predicting phage-host interaction via hyperbolic Poincaré graph embedding and large-scale protein language technique

Jie Pan, Rui Wang, Wenjing Liu, Li Wang, Zhuhong You, Yuechao Li, Zhemeng Duan, Qinghua Huang, Jie Feng, Yanmei Sun, Shiwei Wang

Research output: Contribution to journal › Article › peer-review

Abstract

Bacteriophages (phages) are increasingly viewed as a promising alternative for the treatment of antibiotic-resistant bacterial infections. However, the diversity of host ranges complicates the identification of target phages. Existing computational tools often fail to accurately identify phages across different bacterial species. In this study, we present GE-PHI, a machine-learning-based model for predicting phage-host interactions (PHIs) by integrating a knowledge graph embedding algorithm with a large-scale protein language model. First, a phage-host heterogeneous association network (PHAN) was constructed that incorporated phage-phage and host-host similarity networks. Then, the multi-relational Poincaré graph embedding (MuRP) was used to extract topological patterns. Additionally, we employed the ESM-2 protein language model to capture evolutionary information from phage tail proteins and host-receptor-binding proteins. GE-PHI achieved a cross-validation area under the curve (AUC) of up to 0.9453 in silico and maintained this performance in case studies. This study provides insights into machine-learning-guided phage therapeutics and diagnostics in microbial engineering.
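For readers unfamiliar with hyperbolic embeddings, the following is a minimal sketch of the geodesic distance on the unit Poincaré ball, the geometry in which Poincaré graph embeddings place nodes. It is not the GE-PHI implementation and does not reproduce the MuRP scoring function. The function name `poincare_distance` and the two-dimensional example coordinates are hypothetical and chosen only for illustration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincaré ball.

    d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


# Hypothetical 2-D embeddings, for illustration only (not taken from the paper).
phage = (0.10, 0.20)
host_near = (0.15, 0.25)
host_far = (-0.70, -0.60)

d_near = poincare_distance(phage, host_near)
d_far = poincare_distance(phage, host_far)
```

In a distance-based embedding model, a smaller distance between a phage node and a host node would typically correspond to a higher predicted likelihood of interaction; how GE-PHI combines such embeddings with ESM-2 features is described in the full article, not here.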

Original language: English
Article number: 111647
Journal: iScience
Volume: 28
Issue number: 1
State: Published - 17 Jan 2025

Keywords

  • Bacteriology
  • Machine learning
  • Microbiology
  • Virology
