TY - GEN
T1 - Predicting Hepatoma-Related Genes Based on Representation Learning of PPI network and Gene Ontology Annotations
AU - Wang, Tao
AU - Shao, Zhiyuan
AU - Xiao, Yifu
AU - Zhang, Xuchao
AU - Chen, Yitian
AU - Shi, Binze
AU - Chen, Siyu
AU - Wang, Yuxian
AU - Peng, Jiajie
AU - Shang, Xuequn
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Hepatoma is the most common type of primary liver cancer and has a high mortality rate worldwide. The genetic causes of the disease pathology remain largely unknown. Effective discovery of the genes associated with hepatoma has become important for disease prevention, early diagnosis, and therapeutic treatment. With the development of molecular networks, graph-based methods have been highly successful in predicting disease genes based on the guilt-by-association hypothesis. Network representation learning (NRL) techniques have accelerated disease gene discovery in recent years because of their powerful network feature extraction ability. However, current NRL-based methods for disease gene discovery do not consider gene features derived from gene ontology annotations, which a priori group genes with similar functions. To fill this gap, we propose a novel framework to predict hepatoma-related genes based on representation learning from both a protein-protein interaction (PPI) network and gene ontology annotations. Our framework has three steps: learning features from the PPI network and gene ontologies using NRL techniques, integrating the different features with an autoencoder, and predicting hepatoma-related genes using machine learning classifiers. Experiments demonstrate that our framework accurately predicts hepatoma-related genes, with AUROC and AUPRC reaching 0.93 and 0.94, respectively. Compared with methods that use only a single type of representation feature, our framework also shows superior performance on hepatoma gene prediction.
KW - disease gene prediction
KW - gene ontology
KW - Hepatoma
KW - network representation learning
KW - PPI network
UR - http://www.scopus.com/inward/record.url?scp=85125187361&partnerID=8YFLogxK
U2 - 10.1109/BIBM52615.2021.9669479
DO - 10.1109/BIBM52615.2021.9669479
M3 - Conference contribution
AN - SCOPUS:85125187361
T3 - Proceedings - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021
SP - 1892
EP - 1898
BT - Proceedings - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021
A2 - Huang, Yufei
A2 - Kurgan, Lukasz
A2 - Luo, Feng
A2 - Hu, Xiaohua Tony
A2 - Chen, Yidong
A2 - Dougherty, Edward
A2 - Kloczkowski, Andrzej
A2 - Li, Yaohang
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021
Y2 - 9 December 2021 through 12 December 2021
ER -