Constrained query of order-preserving submatrix in gene expression data

Tao Jiang, Zhanhuai Li, Xuequn Shang, Bolin Chen, Weibang Li, Zhilei Yin

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

The order-preserving submatrix (OPSM) has become an important model for biologically meaningful subspace clusters, capturing the general tendency of gene expression across a subset of conditions. With advances in microarray and analysis techniques, large volumes of gene expression datasets and OPSM mining results are being produced. An OPSM query can efficiently retrieve relevant OPSMs from this large body of OPSM data. However, improving the relevance of OPSM queries remains difficult in real-life exploratory data analysis. First, it is hard to capture subjective aspects of interestingness, e.g., the analyst’s expectations given her/his domain knowledge. Second, even when these expectations can be specified declaratively, it is still challenging to use them during the computation of OPSM queries. To the best of our knowledge, existing methods focus mainly on batch OPSM mining, while few works address OPSM queries. To solve these problems, the paper proposes two constrained OPSM query methods, which exploit user-defined constraints to search for relevant results in two kinds of indices that it introduces. Extensive experiments on real datasets demonstrate that queries based on the multi-dimension index (cIndex) and the enumerating sequence index (esIndex) perform better than brute-force search.
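
As a rough illustration of the brute-force baseline mentioned in the abstract, the following minimal Python sketch scans every gene and keeps those whose expression values rise along a queried order of conditions. The function names, the toy matrix, and the form of the constraint (a column order plus a minimum number of supporting genes) are assumptions made for this example only; the abstract does not specify the paper's constraint types, and cIndex and esIndex are not reproduced here.

```python
from typing import Dict, List, Sequence


def supports_order(row: Sequence[float], columns: Sequence[int]) -> bool:
    """Return True if the row's values strictly increase along `columns`."""
    return all(row[a] < row[b] for a, b in zip(columns, columns[1:]))


def brute_force_query(
    matrix: Dict[str, List[float]],
    column_order: Sequence[int],
    min_genes: int = 2,  # hypothetical size constraint, not taken from the paper
) -> List[str]:
    """Scan every gene and keep those that follow `column_order`.

    Returns an empty list when fewer than `min_genes` genes qualify.
    """
    hits = [gene for gene, row in matrix.items()
            if supports_order(row, column_order)]
    return hits if len(hits) >= min_genes else []


# Toy expression matrix: gene name -> expression value per condition (made up).
expression = {
    "g1": [0.2, 1.5, 0.9, 2.1],
    "g2": [1.0, 3.0, 2.0, 4.0],
    "g3": [2.5, 0.4, 1.1, 0.3],
}

# Query: which genes increase across conditions 0, 2, 1, 3 in that order?
result = brute_force_query(expression, column_order=[0, 2, 1, 3])
print(result)
```

Every query in this sketch touches every row of the matrix, which is the cost that an index-based query is meant to avoid.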

Original language: English
Pages (from-to): 1052-1066
Number of pages: 15
Journal: Frontiers of Computer Science
Volume: 10
Issue number: 6
State: Published - 1 Dec 2016

Keywords

  • brute-force search
  • cIndex
  • constrained query
  • feature sequence
  • gene expression data
  • OPSM
