Enhanced RNA Sequence Representation through Sequence Masking and Subsequence Consistency Optimization

Yewei Shen, Zhiyuan Wang, Zongyu Li, Xinmeng Liu, Xuequn Shang, Yongtian Wang

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

1 Scopus citation

Abstract

In the burgeoning field of RNA research, accurate and efficient RNA sequence representation remains a pivotal challenge, exacerbated by the complexity and diversity of RNA sequences. Addressing the critical need for enhanced sequence representation and the issues of sequence context and structural alignment, this study introduces a novel, comprehensive approach. The proposed model integrates sequence masking and subsequence consistency optimization, offering a robust solution to the intricate problem of RNA sequence representation. The model is evaluated on the filtered RNAStralign dataset, which comprises 20,923 sequences, using a Support Vector Machine (SVM) for subsequent RNA family classification tasks. Despite the inherent imbalance in RNA family sequence distribution, the model demonstrates exemplary performance, achieving high classification accuracy and area under the precision-recall curve (AUPRC) values across diverse RNA sequence groups. This balanced and unbiased assessment, ensured by the use of AUPRC as an evaluation metric, highlights the model's practical utility for comprehensive RNA sequence analysis and classification. In essence, this research presents a method for enhanced RNA sequence representation and lays a robust foundation for future advancements in the nuanced field of RNA sequence analysis.
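
As a rough illustration of why the abstract pairs accuracy with AUPRC on imbalanced RNA families, the sketch below computes a step-wise AUPRC (average precision) for a single family in a one-vs-rest setting. It is not the authors' evaluation code: the function name `average_precision`, the toy labels, and the toy scores are all hypothetical, and the scores are assumed to be distinct so that the ranking is unambiguous.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve for one class.

    labels: 1 for the positive class (e.g. one RNA family), 0 otherwise.
    scores: classifier confidence for the positive class (assumed distinct).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank examples from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data: 2 sequences of a rare family among 8 in total.
labels = [1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]

ap = average_precision(labels, scores)

# A classifier that always predicts "not this family" still looks
# accurate on imbalanced data, which is why accuracy alone can mislead.
accuracy_all_negative = labels.count(0) / len(labels)
```

In this toy case the always-negative baseline reaches 75% accuracy without identifying a single member of the rare family, whereas average precision depends only on how highly the true members are ranked.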

Original language: English
Title of host publication: Proceedings - 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023
Editors: Xingpeng Jiang, Haiying Wang, Reda Alhajj, Xiaohua Hu, Felix Engel, Mufti Mahmud, Nadia Pisanti, Xuefeng Cui, Hong Song
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2938-2944
Number of pages: 7
ISBN (Electronic): 9798350337488
State: Published - 2023
Event: 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023 - Istanbul, Turkey
Duration: 5 Dec 2023 to 8 Dec 2023

Publication series

Name: Proceedings - 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023

Conference

Conference: 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023
Country/Territory: Turkey
City: Istanbul
Period: 5/12/23 to 8/12/23

Keywords

  • RNA family classification
  • sequence masking
  • sequence representation
  • subsequence consistency optimization
