Enhancing cell subpopulation discovery in cancer by integrating single-cell transcriptome and expressed variants

Tao Wang, Duoduo Mai, Han Shu, Jialu Hu, Yongtian Wang, Jiajie Peng, Jing Chen, Xuequn Shang

Research output: Contribution to journal › Article › peer-review

Abstract

Single-cell RNA sequencing (scRNA-seq) technology has revolutionized the study of cellular heterogeneity. However, existing methods for identifying cell subpopulations in scRNA-seq data rely mainly on gene expression features and neglect the valuable genomic information present in the raw sequencing data. To address this limitation, we propose an end-to-end deep clustering model called scCluster, which integrates single-cell gene expression profiles and expressed variant features derived from the raw scRNA-seq data to stratify cell subpopulations in cancer tissues. In the pre-training phase, scCluster employs a joint optimization strategy that combines a zero-inflated negative binomial (ZINB) model-based dual-modal autoencoder with deep embedding clustering. This allows both gene expression profiles and variant features to be encoded into the same latent embedding space. In the fine-tuning stage, scCluster further enhances the discriminability of the latent representations by integrating deep soft K-means clustering and cross-instance guided contrastive clustering techniques. Our extensive evaluations show that scCluster outperforms state-of-the-art methods on multiple real-world cancer scRNA-seq datasets. The results also indicate that incorporating expressed variant features alongside gene expression substantially enhances the stratification of cell subpopulations in cancer single-cell research.
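To make two of the building blocks named in the abstract concrete, the following is a minimal sketch, not the scCluster implementation. It shows the log-probability of a count under a zero-inflated negative binomial distribution, which is the usual form of a ZINB reconstruction loss, and the Student's t soft cluster assignment used in standard deep embedding clustering. The function names, the parameterisation by mean `mu`, dispersion `theta` and dropout probability `pi`, and the choice of `alpha=1.0` are illustrative assumptions; the abstract does not specify how scCluster defines these terms.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu > 0 and dispersion theta > 0."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability of count x under a zero-inflated NB.
    pi in (0, 1) is the probability of an extra (dropout) zero."""
    if x == 0:
        # A zero can come from dropout or from the NB component itself.
        nb_zero = math.exp(nb_log_pmf(0, mu, theta))
        return math.log(pi + (1.0 - pi) * nb_zero)
    # A non-zero count can only come from the NB component.
    return math.log(1.0 - pi) + nb_log_pmf(x, mu, theta)


def soft_assign(z, centroids, alpha=1.0):
    """Student's t soft assignment of one embedding z to each centroid,
    as in standard deep embedding clustering (assumed form)."""
    weights = []
    for c in centroids:
        d2 = sum((a - b) ** 2 for a, b in zip(z, c))  # squared distance
        weights.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical usage: negative log-likelihood of a few counts for one gene,
# and the soft assignment of a 2-D embedding to two cluster centres.
counts = [0, 0, 3, 1, 7]
nll = -sum(zinb_log_pmf(x, mu=2.0, theta=1.5, pi=0.3) for x in counts)
q = soft_assign([0.0, 0.0], [[0.0, 0.0], [3.0, 4.0]])
```

In a trained model, `mu`, `theta` and `pi` would be produced per gene and per cell by the decoder, and the centroids would be learned jointly with the encoder; here they are fixed constants purely for illustration.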

Original language: English
Journal: Fundamental Research
State: Accepted/In press - 2025

Keywords

  • Cancer
  • Deep learning
  • Expressed variants
  • Multi-omics data integration
  • Single-cell subpopulation
