TY - GEN
T1 - Predicting circRNA-disease associations using similarity assessing graph convolution from multi-source information networks
AU - Li, Yang
AU - Hu, Xue Gang
AU - Li, Pei Pei
AU - Wang, Lei
AU - You, Zhu Hong
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Circular RNA (circRNA), a novel endogenous noncoding RNA molecule with a closed-loop structure, can serve as a biomarker for many complex human diseases. Determining the relationships between circRNAs and diseases helps us understand the pathogenesis, diagnosis, and treatment of complex diseases and plays a critical role in clinical research. However, discovering new circRNA-disease associations by wet-lab methods is time-consuming, costly, and largely blind and undirected, and it is limited to small-scale studies. There is therefore an urgent need for efficient and reliable computational methods that infer potential circRNA-disease associations on a large scale, reducing cost and time while avoiding high false-positive rates. In this paper, we propose a novel computational method for predicting circRNA-disease associations based on the Similarity Assessing Graph Convolution Network (SAGCN) algorithm, which draws on multi-source similarity networks constructed for circRNAs and diseases. First, we fuse the multi-source similarity information of circRNAs and of diseases and construct a multi-source similarity network for each. Then, we use the SAGCN algorithm to extract hidden feature representations of circRNAs and diseases by measuring the similarity between different nodes in the network. Finally, the resulting high-level features of circRNAs and diseases are fed to a multilayer perceptron (MLP) classifier for prediction. Under 5-fold cross-validation on the benchmark circR2Disease dataset, the four SAGCN algorithms achieve AUC scores of 93.30%, 92.98%, 92.22%, and 91.94%, respectively. In addition, case studies show that the model's predictions are supported by biological experiments: 25 of the top 30 highest-scoring circRNA-disease associations were confirmed by recent literature. Based on these results, the proposed model can be expected to serve as an effective computational tool for predicting circRNA-disease associations and to provide promising candidates for biological experiments.
KW - circRNA
KW - circRNA-disease association
KW - deep learning
KW - graph convolutional network
KW - multilayer perceptron
KW - similarity assessment
UR - http://www.scopus.com/inward/record.url?scp=85146651528&partnerID=8YFLogxK
U2 - 10.1109/BIBM55620.2022.9995674
DO - 10.1109/BIBM55620.2022.9995674
M3 - Conference contribution
AN - SCOPUS:85146651528
T3 - Proceedings - 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022
SP - 94
EP - 101
BT - Proceedings - 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022
A2 - Adjeroh, Donald
A2 - Long, Qi
A2 - Shi, Xinghua
A2 - Guo, Fei
A2 - Hu, Xiaohua
A2 - Aluru, Srinivas
A2 - Narasimhan, Giri
A2 - Wang, Jianxin
A2 - Kang, Mingon
A2 - Mondal, Ananda M.
A2 - Liu, Jin
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022
Y2 - 6 December 2022 through 8 December 2022
ER -