A novel variational Bayesian method for variable selection in logistic regression models

Chun Xia Zhang, Shuang Xu, Jiang She Zhang

Research output: Contribution to journal › Article › peer-review

26 Scopus citations

Abstract

With high-dimensional data emerging in various domains, sparse logistic regression models have attracted considerable interest from researchers. Variable selection plays a key role in both improving prediction accuracy and enhancing the interpretability of the fitted models. Bayesian variable selection approaches offer several advantages, such as high selection accuracy and the ability to incorporate many kinds of prior knowledge. However, because Bayesian methods generally draw inference from the posterior distribution with Markov chain Monte Carlo (MCMC) techniques, they become intractable in high-dimensional settings owing to the large search space. To address this issue, a novel variational Bayesian method for variable selection in high-dimensional logistic regression models is presented. The proposed method is based on the indicator model, in which each covariate is equipped with a binary latent variable indicating whether it is important. A Bernoulli-type prior is adopted for the latent indicator variables. For the hyperparameter of the Bernoulli prior, we provide two schemes to determine its optimal value so that the model can achieve sparsity adaptively. To identify important variables and make predictions, an efficient variational Bayesian approach is employed to draw inference from the posterior distribution. Experiments on both synthetic and publicly available data show that the new method outperforms, or is very competitive with, several popular counterparts.
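
For orientation, the kind of indicator model the abstract describes can be sketched as follows. This is a generic formulation under stated assumptions, not the paper's exact specification: the abstract confirms only the binary indicator per covariate, the Bernoulli prior with a hyperparameter, and the use of variational Bayes. The symbols $\gamma_j$, $\pi$, $\tau^2$, the Gaussian prior on the coefficients, and the mean-field factorization are all illustrative assumptions.

```latex
% Generic indicator-model sketch; notation and the priors marked "assumed"
% are illustrative and are not taken from the abstract.
\begin{align*}
y_i \mid x_i, \beta, \gamma
  &\sim \mathrm{Bernoulli}\!\left(\sigma\!\Big(\beta_0 + \sum_{j=1}^{p} \gamma_j \beta_j x_{ij}\Big)\right),
  \qquad \sigma(t) = \frac{1}{1 + e^{-t}}, \\
\gamma_j \mid \pi
  &\sim \mathrm{Bernoulli}(\pi), \qquad j = 1, \dots, p, \\
\beta_j
  &\sim \mathcal{N}(0, \tau^2) \quad \text{(assumed prior on the coefficients)}, \\
q(\beta, \gamma)
  &= \prod_{j=1}^{p} q(\beta_j)\, q(\gamma_j) \quad \text{(assumed mean-field factorization)}.
\end{align*}
```

In a sketch like this, $\pi$ plays the role of the Bernoulli hyperparameter that controls sparsity, and a covariate might be declared important when the approximate posterior probability $q(\gamma_j = 1)$ exceeds a chosen threshold such as 0.5 (again an assumption, not a rule stated in the abstract).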

Original language: English
Pages (from-to): 1-19
Number of pages: 19
Journal: Computational Statistics and Data Analysis
Volume: 133
State: Published - May 2019
Externally published: Yes

Keywords

  • High-dimensional data
  • Indicator model
  • Logistic regression
  • Sparse model
  • Variable selection
  • Variational Bayes
