TY - JOUR
T1 - BGFE
T2 - A deep learning model for ncRNA-protein interaction predictions based on improved sequence information
AU - Zhan, Zhao Hui
AU - Jia, Li Na
AU - Zhou, Yong
AU - Li, Li Ping
AU - Yi, Hai Cheng
N1 - Publisher Copyright:
© 2019 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2019/2/2
Y1 - 2019/2/2
N2 - The interactions between ncRNAs and proteins are critical for regulating various cellular processes in organisms, such as gene expression regulations. However, due to limitations, including financial and material consumptions in recent experimental methods for predicting ncRNA and protein interactions, it is essential to propose an innovative and practical approach with convincing performance of prediction accuracy. In this study, based on the protein sequences from a biological perspective, we put forward an effective deep learning method, named BGFE, to predict ncRNA and protein interactions. Protein sequences are represented by bi-gram probability feature extraction method from Position Specific Scoring Matrix (PSSM), and for ncRNA sequences, k-mers sparse matrices are employed to represent them. Furthermore, to extract hidden high-level feature information, a stacked auto-encoder network is employed with the stacked ensemble integration strategy. We evaluate the performance of the proposed method by using three datasets and a five-fold cross-validation after classifying the features through the random forest classifier. The experimental results clearly demonstrate the effectiveness and the prediction accuracy of our approach. In general, the proposed method is helpful for ncRNA and protein interacting predictions and it provides some serviceable guidance in future biological research.
AB - The interactions between ncRNAs and proteins are critical for regulating various cellular processes in organisms, such as gene expression regulations. However, due to limitations, including financial and material consumptions in recent experimental methods for predicting ncRNA and protein interactions, it is essential to propose an innovative and practical approach with convincing performance of prediction accuracy. In this study, based on the protein sequences from a biological perspective, we put forward an effective deep learning method, named BGFE, to predict ncRNA and protein interactions. Protein sequences are represented by bi-gram probability feature extraction method from Position Specific Scoring Matrix (PSSM), and for ncRNA sequences, k-mers sparse matrices are employed to represent them. Furthermore, to extract hidden high-level feature information, a stacked auto-encoder network is employed with the stacked ensemble integration strategy. We evaluate the performance of the proposed method by using three datasets and a five-fold cross-validation after classifying the features through the random forest classifier. The experimental results clearly demonstrate the effectiveness and the prediction accuracy of our approach. In general, the proposed method is helpful for ncRNA and protein interacting predictions and it provides some serviceable guidance in future biological research.
KW - Bi-gram
KW - Deep learning
KW - K-mers
KW - NcRNA-protein interaction
KW - Position specific scoring matrix
UR - http://www.scopus.com/inward/record.url?scp=85065122211&partnerID=8YFLogxK
U2 - 10.3390/ijms20040978
DO - 10.3390/ijms20040978
M3 - 文章
C2 - 30813451
AN - SCOPUS:85065122211
SN - 1661-6596
VL - 20
JO - International Journal of Molecular Sciences
JF - International Journal of Molecular Sciences
IS - 4
M1 - 978
ER -