A fast and efficient count-based matrix factorization method for detecting cell types from single-cell RNAseq data

Shiquan Sun, Yabo Chen, Yang Liu, Xuequn Shang

Research output: Contribution to journal, article, peer-reviewed

14 citations (Scopus)

Abstract

Background: Single-cell RNA sequencing (scRNAseq) data typically contain various unwanted sources of variation, which can mask the true signal needed to identify cell types. An efficient way to deal with this issue is to extract low-dimensional information from the high-dimensional gene expression data to represent cell-type structure. In the past two years, several powerful matrix factorization tools have been developed for scRNAseq data, such as NMF, ZIFA, pCMF and ZINB-WaVE. However, the existing approaches either cannot directly model the raw counts of scRNAseq data or are very time-consuming when handling a large number of cells (e.g. n > 500).

Results: In this paper, we developed a fast and efficient count-based matrix factorization method (single-cell negative binomial matrix factorization, scNBMF) based on the TensorFlow framework to infer the low-dimensional structure of cell types. To evaluate the method, including its scalability, we conducted a series of experiments on three public scRNAseq data sets: brain, embryonic stem, and pancreatic islet. The experimental results show that scNBMF is more powerful at detecting cell types than the bespoke scRNAseq tools and 10 to 100 times faster.

Conclusions: We proposed a fast and efficient count-based matrix factorization method, scNBMF, for detecting cell types. Experiments on three public scRNAseq data sets show that scNBMF is a more powerful tool for large-scale scRNAseq data analysis. scNBMF is implemented in R and Python, and the source code is freely available at https://github.com/sqsun.
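The abstract does not give the model details, so the following is only a minimal sketch of the general idea behind a count-based negative binomial matrix factorization, not the authors' scNBMF implementation. It assumes a log link (mean = exp(W·H)), a single fixed dispersion `theta`, and plain gradient ascent in pure Python rather than TensorFlow. The function names `linear_predictor`, `nb_log_likelihood` and `fit_nbmf` and all default parameter values are hypothetical choices made for illustration.

```python
import math
import random


def linear_predictor(W, H, i, j):
    """Return eta_ij = sum_k W[i][k] * H[k][j], clipped so exp() cannot overflow."""
    eta = sum(W[i][k] * H[k][j] for k in range(len(H)))
    return max(-30.0, min(30.0, eta))


def nb_log_likelihood(counts, W, H, theta):
    """Negative binomial log-likelihood of raw counts with mean exp(W @ H).

    theta is an assumed, fixed dispersion (size) parameter shared by all genes.
    """
    total = 0.0
    for i, row in enumerate(counts):
        for j, y in enumerate(row):
            mu = math.exp(linear_predictor(W, H, i, j))
            total += (
                math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
                + theta * math.log(theta / (theta + mu))
                + y * math.log(mu / (theta + mu))
            )
    return total


def fit_nbmf(counts, n_factors=2, theta=1.0, lr=0.005, n_iter=300, seed=0):
    """Fit W (cells x factors) and H (factors x genes) by gradient ascent.

    Returns W, H and the log-likelihood before training and after each step.
    """
    rng = random.Random(seed)
    n, p = len(counts), len(counts[0])
    # Small random start so the factors are not identical.
    W = [[rng.gauss(0.0, 0.1) for _ in range(n_factors)] for _ in range(n)]
    H = [[rng.gauss(0.0, 0.1) for _ in range(p)] for _ in range(n_factors)]
    history = [nb_log_likelihood(counts, W, H, theta)]
    for _ in range(n_iter):
        # d(log-lik)/d(eta_ij) = theta * (y_ij - mu_ij) / (theta + mu_ij)
        G = [[0.0] * p for _ in range(n)]
        for i in range(n):
            for j in range(p):
                mu = math.exp(linear_predictor(W, H, i, j))
                G[i][j] = theta * (counts[i][j] - mu) / (theta + mu)
        # Chain rule through eta = W @ H; both gradients use the old W and H.
        grad_W = [[sum(G[i][j] * H[k][j] for j in range(p))
                   for k in range(n_factors)] for i in range(n)]
        grad_H = [[sum(G[i][j] * W[i][k] for i in range(n))
                   for j in range(p)] for k in range(n_factors)]
        for i in range(n):
            for k in range(n_factors):
                W[i][k] += lr * grad_W[i][k]
        for k in range(n_factors):
            for j in range(p):
                H[k][j] += lr * grad_H[k][j]
        history.append(nb_log_likelihood(counts, W, H, theta))
    return W, H, history


if __name__ == "__main__":
    # Toy data: 6 cells x 4 genes with two made-up "cell types".
    toy_counts = [
        [9, 7, 0, 1], [8, 10, 1, 0], [7, 8, 0, 0],
        [0, 1, 6, 9], [1, 0, 8, 7], [0, 0, 9, 8],
    ]
    W, H, history = fit_nbmf(toy_counts)
    print("log-likelihood before/after:", history[0], history[-1])
    for row in W:  # each row is a cell's low-dimensional representation
        print([round(v, 2) for v in row])
```

In this sketch the rows of `W` play the role of the low-dimensional cell representation that a downstream clustering step would use to group cells into types. A practical implementation would estimate the dispersion, add regularization, and use an optimized tensor library to scale to thousands of cells.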

Original language: English
Article number: 28
Journal: BMC Systems Biology
Volume: 13
DOI
Publication status: Published - 5 Apr 2019
