Abstract
This study proposes BioKG-CMI, a model that predicts circRNA-miRNA interactions (CMIs) from a biological knowledge graph. Because available data are limited, we use subcellular localization to generate negative samples that are more consistent with biological logic. To mine the semantic information in circRNA and miRNA sequences, we use the pre-trained model BERT to learn sequence feature representations. Guided by the hypothesis that adjacent molecules have similar functions, we calculate the spatial proximity between nodes of the same class. The DistMult algorithm is applied to extract the latent logical rules of the knowledge graph and to learn entity and relation representations. Integrating these multiple features then addresses the difficulty of representing the complex biological knowledge graph and overcomes the inadequacy of any single feature. Multiple comparative experiments and case studies demonstrate the robustness of the proposed model.
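As a rough illustration of the knowledge-graph embedding step, the sketch below shows the standard DistMult scoring function, which rates a (head, relation, tail) triple by the sum of the element-wise product of the three embedding vectors. It is a minimal sketch and not the authors' implementation: the function name `distmult_score`, the three-dimensional vectors, and their values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from typing import Sequence


def distmult_score(
    head: Sequence[float],
    relation: Sequence[float],
    tail: Sequence[float],
) -> float:
    """Trilinear DistMult score: sum_i head[i] * relation[i] * tail[i]."""
    if not (len(head) == len(relation) == len(tail)):
        raise ValueError("head, relation and tail must share one dimension")
    return sum(h * r * t for h, r, t in zip(head, relation, tail))


# Hypothetical toy embeddings (assumed values, not taken from the paper).
circ_rna = [0.5, -1.0, 2.0]   # embedding of a circRNA entity
mi_rna = [1.0, 0.5, -0.5]     # embedding of a miRNA entity
interacts = [2.0, -1.0, 0.5]  # embedding of the "interacts with" relation

# A higher score means the triple is considered more plausible.
score = distmult_score(circ_rna, interacts, mi_rna)
print(score)
```

Because the relation acts as a diagonal weighting, the score is unchanged when head and tail are swapped, so DistMult treats every relation as symmetric.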
Original language | English |
---|---|
Article number | 189104 |
Journal | Science China Information Sciences |
Volume | 67 |
Issue number | 8 |
DOIs | |
State | Published - Aug 2024 |