TY - JOUR
T1 - A hierarchical GNN across semantic and topological domains for predicting circRNA-microRNA interactions
AU - Zhou, Jiren
AU - Ji, Boya
AU - Niu, Rui
AU - Shang, Xuequn
AU - You, Zhuhong
N1 - Publisher Copyright: © 2024 Elsevier B.V.
PY - 2024/11/25
Y1 - 2024/11/25
N2 - Identifying circRNA-microRNA interactions (CMI) has become a significant biomedical problem in recent years. Solving it provides insights into using circRNA as biomarkers, developing cancer therapies and producing cancer vaccines. Using computational methods for identification is a more time-efficient and cost-effective approach. Among computational methods, using graphs to represent and explore CMI is a mainstream approach. However, existing methods do not make optimal use of both the semantic information extracted from sequences and the topological information extracted from graph structures. To address this issue, we propose HGLMALLM, a graph contrastive learning method that learns node representations across both the semantic domain, generated via motif-aware pre-trained LLMs, and the topological domain, extracted from hierarchical graph structures. Our method effectively addresses the problem in existing Message Passing Neural Network (MPNN) methods that edge components lose heterogeneity after multiple iterations. Moreover, the method exploits the heterogeneity of the graph, which is extended from the traditional bipartite graph to a heterogeneous graph through the semantic domain. Two commonly used datasets were partitioned based on the distribution of node degrees, and we then benchmarked our method against existing methods. In the independent testing set evaluation, it achieved a 3 % and 1 % improvement on the two datasets, respectively. Our method demonstrated the best stability in ten-fold cross-validation on the training set. A test conducted on the peripheral components reveals robust performance of our model. A dataset collected from real scenarios was used to demonstrate the strong predictive ability of our method for identifying previously unidentified CMI.
AB - Identifying circRNA-microRNA interactions (CMI) has become a significant biomedical problem in recent years. Solving it provides insights into using circRNA as biomarkers, developing cancer therapies and producing cancer vaccines. Using computational methods for identification is a more time-efficient and cost-effective approach. Among computational methods, using graphs to represent and explore CMI is a mainstream approach. However, existing methods do not make optimal use of both the semantic information extracted from sequences and the topological information extracted from graph structures. To address this issue, we propose HGLMALLM, a graph contrastive learning method that learns node representations across both the semantic domain, generated via motif-aware pre-trained LLMs, and the topological domain, extracted from hierarchical graph structures. Our method effectively addresses the problem in existing Message Passing Neural Network (MPNN) methods that edge components lose heterogeneity after multiple iterations. Moreover, the method exploits the heterogeneity of the graph, which is extended from the traditional bipartite graph to a heterogeneous graph through the semantic domain. Two commonly used datasets were partitioned based on the distribution of node degrees, and we then benchmarked our method against existing methods. In the independent testing set evaluation, it achieved a 3 % and 1 % improvement on the two datasets, respectively. Our method demonstrated the best stability in ten-fold cross-validation on the training set. A test conducted on the peripheral components reveals robust performance of our model. A dataset collected from real scenarios was used to demonstrate the strong predictive ability of our method for identifying previously unidentified CMI.
KW - CircRNA-microRNA interactions
KW - Cross-domain contrastive learning
KW - Graph neural network
KW - Mutual information maximization
UR - http://www.scopus.com/inward/record.url?scp=85205313203&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2024.112549
DO - 10.1016/j.knosys.2024.112549
M3 - Article
AN - SCOPUS:85205313203
SN - 0950-7051
VL - 304
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 112549
ER -