TY - JOUR
T1 - GAE-LGA
T2 - integration of multi-omics data with graph autoencoders to identify lncRNA–PCG associations
AU - Gao, Meihong
AU - Liu, Shuhui
AU - Qi, Yang
AU - Guo, Xinpeng
AU - Shang, Xuequn
N1 - Publisher Copyright:
© The Author(s) 2022. Published by Oxford University Press. All rights reserved.
PY - 2022/11/1
Y1 - 2022/11/1
N2 - Long non-coding RNAs (lncRNAs) can disrupt the biological functions of protein-coding genes (PCGs) to cause cancer. However, the relationship between lncRNAs and PCGs remains unclear and difficult to predict. Machine learning has achieved a satisfactory performance in association prediction, but to our knowledge, it is currently less used in lncRNA–PCG association prediction. Therefore, we introduce GAE-LGA, a powerful deep learning model with graph autoencoders as components, to recognize potential lncRNA–PCG associations. GAE-LGA jointly explored lncRNA–PCG learning and cross-omics correlation learning for effective lncRNA–PCG association identification. The functional similarity and multi-omics similarity of lncRNAs and PCGs were accumulated and encoded by graph autoencoders to extract feature representations of lncRNAs and PCGs, which were subsequently used for decoding to obtain candidate lncRNA–PCG pairs. Comprehensive evaluation demonstrated that GAE-LGA can successfully capture lncRNA–PCG associations with strong robustness and outperformed other machine learning-based identification methods. Furthermore, multi-omics features were shown to improve the performance of lncRNA–PCG association identification. In conclusion, GAE-LGA can act as an efficient application for lncRNA–PCG association prediction with the following advantages: It fuses multi-omics information into the similarity network, making the feature representation more accurate; it can predict lncRNA–PCG associations for new lncRNAs and identify potential lncRNA–PCG associations with high accuracy.
AB - Long non-coding RNAs (lncRNAs) can disrupt the biological functions of protein-coding genes (PCGs) to cause cancer. However, the relationship between lncRNAs and PCGs remains unclear and difficult to predict. Machine learning has achieved a satisfactory performance in association prediction, but to our knowledge, it is currently less used in lncRNA–PCG association prediction. Therefore, we introduce GAE-LGA, a powerful deep learning model with graph autoencoders as components, to recognize potential lncRNA–PCG associations. GAE-LGA jointly explored lncRNA–PCG learning and cross-omics correlation learning for effective lncRNA–PCG association identification. The functional similarity and multi-omics similarity of lncRNAs and PCGs were accumulated and encoded by graph autoencoders to extract feature representations of lncRNAs and PCGs, which were subsequently used for decoding to obtain candidate lncRNA–PCG pairs. Comprehensive evaluation demonstrated that GAE-LGA can successfully capture lncRNA–PCG associations with strong robustness and outperformed other machine learning-based identification methods. Furthermore, multi-omics features were shown to improve the performance of lncRNA–PCG association identification. In conclusion, GAE-LGA can act as an efficient application for lncRNA–PCG association prediction with the following advantages: It fuses multi-omics information into the similarity network, making the feature representation more accurate; it can predict lncRNA–PCG associations for new lncRNAs and identify potential lncRNA–PCG associations with high accuracy.
KW - cross-omics correlation learning
KW - graph autoencoders
KW - lncRNA–PCG associations prediction
KW - long non-coding RNAs
KW - protein-coding genes
UR - http://www.scopus.com/inward/record.url?scp=85142403105&partnerID=8YFLogxK
U2 - 10.1093/bib/bbac452
DO - 10.1093/bib/bbac452
M3 - 文章
C2 - 36305456
AN - SCOPUS:85142403105
SN - 1467-5463
VL - 23
JO - Briefings in Bioinformatics
JF - Briefings in Bioinformatics
IS - 6
M1 - bbac452
ER -