A hierarchical GNN across semantic and topological domains for predicting circRNA-microRNA interactions

Jiren Zhou; Boya Ji; Rui Niu; Xuequn Shang; Zhuhong You

doi:10.1016/j.knosys.2024.112549

School of Computer Science

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

Identifying circRNA-microRNA interactions (CMI) has become a significant biomedical problem in recent years. Solving it provides insights into using circRNAs as biomarkers, developing cancer therapies, and producing cancer vaccines. Computational methods offer a more time-efficient and cost-effective way to identify these interactions. Among computational methods, representing and exploring CMI with graphs is a mainstream approach. However, existing methods do not achieve optimal results when combining the semantic information extracted from sequences with the topological information extracted from graph structures. To address this issue, we propose HGLMALLM, a graph contrastive learning method that learns node representations across both the semantic domain, generated via motif-aware pre-trained LLMs, and the topological domain, extracted from hierarchical graph structures. Our method effectively addresses a problem in existing Message Passing Neural Network (MPNN) methods: edge components lose heterogeneity after multiple iterations. Moreover, the method exploits the heterogeneity of the graph, which is extended from the traditional bipartite graph to a heterogeneous graph through the semantic domain. Two commonly used datasets were partitioned based on the distribution of node degrees, and we then benchmarked our method against existing methods. In the independent test set evaluation, it achieved improvements of 3% and 1% on the two datasets. Our method demonstrated the best stability in ten-fold cross-validation on the training set. A test conducted on the peripheral components shows robust performance of our model. A dataset collected from real scenarios was used to demonstrate the strong predictive ability of our method for identifying previously unidentified CMI.

Original language	English
Article number	112549
Journal	Knowledge-Based Systems
Volume	304
DOI	https://doi.org/10.1016/j.knosys.2024.112549
Publication status	Published - 25 Nov 2024

Cite this

@article{aaa03a3c17d1444d960901f4f590dca2,

title = "A hierarchical GNN across semantic and topological domains for predicting circRNA-microRNA interactions",

abstract = "Identifying circRNA-microRNA interactions (CMI) is a significant biomedical issue in recent years. This problem provides insights into using circRNA as biomarkers, developing cancer therapies and producing cancer vaccines. Using computational methods for identification is a more time-efficient and cost-effective approach. In computational methods, using graphs to represent and explore the CMI is a mainstream approach. However, existing relevant methods do not achieve optimal results by utilizing both the semantic information extracted from sequences and the topological information extracted from graph structures. To address this issue, we propose HGLMALLM, a graph contrastive learning method that learns node representation crossing both the semantic domain generated via motif-aware pre-trained LLMs and the topological domain extracted from hierarchical graph structures. Our method effectively addresses the issue in existing Message Passing Neural Network (MPNN) method that edge components losing heterogeneity after multiple iterations. Moreover, this method utilizes the heterogeneity of graph which is extended from the traditional bipartite graph to heterogeneous through the semantic domain. Two commonly used datasets were partitioned based on the distribution of node degrees. Then, we benchmarked our method against existing methods. In the independent testing set evaluation, it achieved a 3 % and 1 % improvement on two datasets. Our method demonstrated the best stability in ten-fold cross-validation on the training set. A test conducted on the peripheral components reveals robust performance of our model. A dataset collected from real scenarios was used to demonstrate the strong predictive ability of our method for identifying unidentified CMI.",

keywords = "CircRNA-microRNA interactions, Cross-domain contrastive learning, Graph neural network, Mutual information maximization",

author = "Jiren Zhou and Boya Ji and Rui Niu and Xuequn Shang and Zhuhong You",

note = "Publisher Copyright: {\textcopyright} 2024 Elsevier B.V.",

year = "2024",

month = nov,

day = "25",

doi = "10.1016/j.knosys.2024.112549",

language = "English",

volume = "304",

journal = "Knowledge-Based Systems",

issn = "0950-7051",

publisher = "Elsevier B.V.",

}

TY - JOUR

T1 - A hierarchical GNN across semantic and topological domains for predicting circRNA-microRNA interactions

AU - Zhou, Jiren

AU - Ji, Boya

AU - Niu, Rui

AU - Shang, Xuequn

AU - You, Zhuhong

PY - 2024/11/25

Y1 - 2024/11/25

N2 - Identifying circRNA-microRNA interactions (CMI) is a significant biomedical issue in recent years. This problem provides insights into using circRNA as biomarkers, developing cancer therapies and producing cancer vaccines. Using computational methods for identification is a more time-efficient and cost-effective approach. In computational methods, using graphs to represent and explore the CMI is a mainstream approach. However, existing relevant methods do not achieve optimal results by utilizing both the semantic information extracted from sequences and the topological information extracted from graph structures. To address this issue, we propose HGLMALLM, a graph contrastive learning method that learns node representation crossing both the semantic domain generated via motif-aware pre-trained LLMs and the topological domain extracted from hierarchical graph structures. Our method effectively addresses the issue in existing Message Passing Neural Network (MPNN) method that edge components losing heterogeneity after multiple iterations. Moreover, this method utilizes the heterogeneity of graph which is extended from the traditional bipartite graph to heterogeneous through the semantic domain. Two commonly used datasets were partitioned based on the distribution of node degrees. Then, we benchmarked our method against existing methods. In the independent testing set evaluation, it achieved a 3 % and 1 % improvement on two datasets. Our method demonstrated the best stability in ten-fold cross-validation on the training set. A test conducted on the peripheral components reveals robust performance of our model. A dataset collected from real scenarios was used to demonstrate the strong predictive ability of our method for identifying unidentified CMI.

AB - Identifying circRNA-microRNA interactions (CMI) is a significant biomedical issue in recent years. This problem provides insights into using circRNA as biomarkers, developing cancer therapies and producing cancer vaccines. Using computational methods for identification is a more time-efficient and cost-effective approach. In computational methods, using graphs to represent and explore the CMI is a mainstream approach. However, existing relevant methods do not achieve optimal results by utilizing both the semantic information extracted from sequences and the topological information extracted from graph structures. To address this issue, we propose HGLMALLM, a graph contrastive learning method that learns node representation crossing both the semantic domain generated via motif-aware pre-trained LLMs and the topological domain extracted from hierarchical graph structures. Our method effectively addresses the issue in existing Message Passing Neural Network (MPNN) method that edge components losing heterogeneity after multiple iterations. Moreover, this method utilizes the heterogeneity of graph which is extended from the traditional bipartite graph to heterogeneous through the semantic domain. Two commonly used datasets were partitioned based on the distribution of node degrees. Then, we benchmarked our method against existing methods. In the independent testing set evaluation, it achieved a 3 % and 1 % improvement on two datasets. Our method demonstrated the best stability in ten-fold cross-validation on the training set. A test conducted on the peripheral components reveals robust performance of our model. A dataset collected from real scenarios was used to demonstrate the strong predictive ability of our method for identifying unidentified CMI.

KW - CircRNA-microRNA interactions

KW - Cross-domain contrastive learning

KW - Graph neural network

KW - Mutual information maximization

UR - http://www.scopus.com/inward/record.url?scp=85205313203&partnerID=8YFLogxK

U2 - 10.1016/j.knosys.2024.112549

DO - 10.1016/j.knosys.2024.112549

M3 - Article

AN - SCOPUS:85205313203

SN - 0950-7051

VL - 304

JO - Knowledge-Based Systems

JF - Knowledge-Based Systems

M1 - 112549

ER -
