TY - JOUR
T1 - An efficient circRNA-miRNA interaction prediction model by combining biological text mining and wavelet diffusion-based sparse network structure embedding
AU - Wang, Xin Fei
AU - Yu, Chang Qing
AU - You, Zhu Hong
AU - Qiao, Yan
AU - Li, Zheng Wei
AU - Huang, Wen Zhun
N1 - Publisher Copyright: © 2023 Elsevier Ltd
PY - 2023/10
Y1 - 2023/10
AB - Motivation: Accumulating clinical evidence shows that circular RNA (circRNA) plays an important regulatory role in the occurrence and development of human diseases, and is expected to provide a new perspective for the diagnosis and treatment of related diseases. Computational methods can preselect high-probability candidates for wet-lab experiments, saving resources. However, because sparse biological networks lack neighborhood structure, models based on network embedding and graph embedding struggle to achieve ideal results. Results: In this paper, we propose BioDGW-CMI, which combines biological text mining and wavelet diffusion-based sparse network structure embedding to predict circRNA-miRNA interactions (CMIs). In detail, BioDGW-CMI first uses Bidirectional Encoder Representations from Transformers (BERT) for biological text mining to mine hidden features in RNA sequences, then constructs a CMI network and obtains the topological structure embedding of its nodes through heat wavelet diffusion patterns. Next, a denoising autoencoder combines the structural features with Gaussian kernel similarity. Finally, the combined features are sent to LightGBM for training and prediction. BioDGW-CMI achieves the highest prediction performance on all three datasets in the field of CMI prediction. In the case study, all 8 CMI pairs based on circ-ITCH were successfully predicted. Availability: The data and source code can be found at https://github.com/1axin/BioDGW-CMI-model.
KW - Biological text mining
KW - Biomarkers
KW - circRNA-miRNA interaction
KW - Structural role discovery
KW - Structure embedding
UR - http://www.scopus.com/inward/record.url?scp=85169912159&partnerID=8YFLogxK
U2 - 10.1016/j.compbiomed.2023.107421
DO - 10.1016/j.compbiomed.2023.107421
M3 - Article
C2 - 37672925
AN - SCOPUS:85169912159
SN - 0010-4825
VL - 165
JO - Computers in Biology and Medicine
JF - Computers in Biology and Medicine
M1 - 107421
ER -