TY - GEN
T1 - Precise Prediction of Pathogenic Microorganisms Using 16S rRNA Gene Sequences
AU - Huang, Yu An
AU - Huang, Zhi An
AU - You, Zhu Hong
AU - Hu, Pengwei
AU - Li, Li Ping
AU - Li, Zheng Wei
AU - Wang, Lei
N1 - Publisher Copyright: © 2019, Springer Nature Switzerland AG.
PY - 2019
Y1 - 2019
N2 - Clinical observations show that human microorganisms are involved in various human biological processes. Disruption of the symbiotic balance in the host-microbiota relationship has been found to cause different types of complex human diseases. Discovering the associations between microbes and the host health statuses that they affect could provide insight into the mechanisms of diseases caused by microbes. However, experimental approaches are time-consuming and expensive, and little effort has been made to develop computational models for predicting pathogenic microbes on a large scale. The prediction results yielded by such models are anticipated to boost the identification and characterization of potential human pathogenic microbes. Based on the assumption that microbes with similar characteristics tend to be involved in diseases with similar symptoms, forming functional clusters, in this paper we develop a group-based computational model of Bayesian disease-oriented ranking for inferring the microbes most likely to be associated with human diseases. It is the first attempt to predict this kind of association using 16S rRNA gene sequences. Based on the sequence information of genes, we use two computational approaches (BLAST+ and MEGA 7) to measure how similar each pair of microbes is from different aspects. The similarity of diseases, in turn, is computed based on MeSH descriptors. Using data collected from the HMDAD database, the proposed model achieved AUCs of 0.9456, 0.8266, 0.8866 and 0.8926 in leave-one-out, 2-fold, 5-fold and 10-fold cross validations, respectively. In addition, we conducted a case study on colorectal carcinoma and found that 16 of the top 20 predicted microbes can be confirmed by the published literature. The prediction result is publicly released and is anticipated to help researchers prioritize these promising pathogenic microbe candidates for validation via biological experiments.
KW - 16S rRNA sequence analysis
KW - Computational prediction model
KW - Microbe–disease associations
KW - Microflora
KW - Pathogenic microorganisms
UR - http://www.scopus.com/inward/record.url?scp=85070551766&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-26969-2_13
DO - 10.1007/978-3-030-26969-2_13
M3 - Conference contribution
AN - SCOPUS:85070551766
SN - 9783030269685
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 138
EP - 150
BT - Intelligent Computing Theories and Application - 15th International Conference, ICIC 2019, Proceedings
A2 - Huang, De-Shuang
A2 - Jo, Kang-Hyun
A2 - Huang, Zhi-Kai
PB - Springer Verlag
T2 - 15th International Conference on Intelligent Computing, ICIC 2019
Y2 - 3 August 2019 through 6 August 2019
ER -