Abstract
The rapid advancement of single-cell RNA sequencing (scRNA-seq) has generated massive amounts of data for cell type annotation. However, current automated annotation methods remain limited: most model either cell-cell similarities or gene-gene relationships in isolation and neglect their synergistic effects, which leads to suboptimal accuracy and poor biological interpretability. To address this, we propose scProGraph, a prototype-guided graph neural network that jointly performs cell type classification and functional gene subgraph discovery. By constructing a cell similarity graph and incorporating cell-type prototypes as prior anchors, our method simultaneously optimizes classification boundaries and the interpretability of gene subgraphs. Experiments on seven independent datasets spanning three disease categories demonstrate that scProGraph achieves over 90% accuracy on four datasets and exceeds 80% on six datasets, outperforming state-of-the-art methods. Further analysis reveals that the gene subgraphs extracted by scProGraph for Macrophage, Fibroblast, and Monocyte cover 26.92%, 26.83%, and 22.22% of a protein-protein interaction (PPI) network dataset, respectively, validating the biological relevance of the identified gene modules. This study not only provides a high-accuracy tool for single-cell annotation but also opens new avenues for discovering novel biomarkers and regulatory mechanisms through gene relationship mining.
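The two ingredients named in the abstract, a cell similarity graph and cell-type prototypes, can be illustrated with a minimal sketch. This is not the scProGraph implementation: the cosine k-nearest-neighbour graph, the use of class means as prototypes, the nearest-prototype rule, and all function names and toy data below are assumptions chosen only for illustration.

```python
import numpy as np

def knn_cell_graph(X, k=3):
    """Symmetric k-nearest-neighbour adjacency over cells.
    Cosine similarity is an assumed metric, not taken from the paper."""
    Xn = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    sim = Xn @ Xn.T
    np.fill_diagonal(sim, -np.inf)           # a cell is never its own neighbour
    nbrs = np.argsort(-sim, axis=1)[:, :k]   # k most similar cells per row
    A = np.zeros(sim.shape, dtype=bool)
    rows = np.repeat(np.arange(X.shape[0]), k)
    A[rows, nbrs.ravel()] = True
    return A | A.T                           # make the graph undirected

def class_prototypes(X, labels):
    """One prototype per cell type, here simply the mean expression profile."""
    classes = np.unique(labels)
    protos = np.stack([X[labels == c].mean(axis=0) for c in classes])
    return classes, protos

def nearest_prototype(X, classes, protos):
    """Assign each cell to the type of its closest prototype (squared Euclidean)."""
    d = ((X[:, None, :] - protos[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(d, axis=1)]

# Hypothetical toy data: 10 cells, 4 genes, two well-separated cell types.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.1, (5, 4)) + np.array([5.0, 0.0, 0.0, 0.0]),
    rng.normal(0.0, 0.1, (5, 4)) + np.array([0.0, 5.0, 0.0, 0.0]),
])
labels = np.array(["Macrophage"] * 5 + ["Fibroblast"] * 5)

A = knn_cell_graph(X, k=3)
classes, protos = class_prototypes(X, labels)
pred = nearest_prototype(X, classes, protos)
```

In this simplified picture, the prototypes act as fixed anchors in expression space, and the adjacency matrix `A` is the kind of structure a graph neural network would pass messages over.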
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 147-158 |
| Number of pages | 12 |
| Journal | IEEE Transactions on Big Data |
| Volume | 12 |
| Issue number | 1 |
| State | Accepted/In press - 2025 |
Keywords
- Single-cell RNA-seq
- cell type annotation
- explanation
- gene-gene graph
- prototype learning