TY - GEN
T1 - Predicting Human Disease-Associated piRNAs Based on Multi-source Information and Random Forest
AU - Zheng, Kai
AU - You, Zhu Hong
AU - Wang, Lei
AU - Li, Hao Yuan
AU - Ji, Bo Ya
N1 - Publisher Copyright: © 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
AB - Whole-genome analysis studies have shown that Piwi-interacting RNAs (piRNAs) play a crucial role in disease progression and diagnosis and are potential therapeutic targets. However, traditional biological experiments are expensive and time-consuming. Thus, computational models could serve as a complementary means of providing potential disease-related piRNA candidates. In this study, we propose a novel computational model called APDA to identify piRNA-disease associations. The proposed method integrates disease semantic similarity and piRNA sequence information to construct feature vectors, then maps them to an optimal feature subspace through a stacked autoencoder to obtain the final feature vectors. Finally, a random forest classifier is used to infer disease-related piRNAs. In five-fold cross-validation, APDA achieved an average AUC of 0.9088 with a standard deviation of 0.0126, which is significantly better than the compared method. Therefore, the proposed APDA method is a powerful and necessary tool for predicting human disease-associated piRNAs and provides new impetus for revealing the underlying causes of human disease.
KW - Disease
KW - Heterogeneous information
KW - Multi-source information
KW - piRNA-disease associations
KW - PIWI-interacting RNA
UR - http://www.scopus.com/inward/record.url?scp=85094126553&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-60802-6_20
DO - 10.1007/978-3-030-60802-6_20
M3 - Conference contribution
AN - SCOPUS:85094126553
SN - 9783030608019
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 227
EP - 238
BT - Intelligent Computing Theories and Application - 16th International Conference, ICIC 2020, Proceedings
A2 - Huang, De-Shuang
A2 - Jo, Kang-Hyun
PB - Springer Science and Business Media Deutschland GmbH
T2 - 16th International Conference on Intelligent Computing, ICIC 2020
Y2 - 2 October 2020 through 5 October 2020
ER -