Abstract
Analyzing RNA secondary structures plays a crucial role in elucidating the functional mechanisms of RNA. Despite advances in RNA structure determination, the available methods remain low-throughput and resource-intensive. While machine learning-based models have achieved remarkable prediction accuracy, challenges such as data scarcity and overfitting remain common. Here, we introduce a phased learning strategy that integrates RNA sequence and structural context information to mitigate the risk of overfitting and employs pairing constraints to train the model on folding scores. This approach effectively addresses both local and long-range nucleotide interactions, substantially improving the robustness of RNA secondary structure predictions. Our comprehensive analysis across multiple benchmarking datasets demonstrated that our model (DSRNAFold) outperformed existing methods, especially in pseudoknot recognition and chemical mapping activity prediction.
Original language | English |
---|---|
Article number | gkaf533 |
Journal | Nucleic Acids Research |
Volume | 53 |
Issue number | 11 |
DOI | |
Publication status | Published - 24 Jun 2025 |