Robust sparse regression by modeling noise as a mixture of gaussians

Shuang Xu; Chun Xia Zhang

doi:10.1080/02664763.2019.1566448

Xi'an Jiaotong University

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Regression analysis has proven to be an effective tool in a wide variety of fields. Many regression models assume that the noise follows a specific distribution. Although this assumption greatly facilitates theoretical analysis, model-fitting performance may be poor because the assumed noise distribution can deviate substantially from the real noise. Given the complexity of real-world data, the model is also expected to be robust. Without assuming a specific noise distribution, we propose a novel sparse regression method called MoG-Lasso, which directly models the noise in linear regression models via a mixture of Gaussian distributions (MoG). The L₁ penalty is included in the loss function of MoG-Lasso to enhance its ability to identify a sparse model. To estimate the parameters of MoG-Lasso, we present an efficient algorithm based on the EM (expectation maximization) and ADMM (alternating direction method of multipliers) algorithms. In experiments on simulated and real data contaminated by complex noise, MoG-Lasso performs better than several other popular methods, including Lasso, LAD-Lasso and Huber-Lasso, in both ‘p>n’ and ‘p<n’ situations.
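
The abstract names the ingredients (mixture-of-Gaussians noise, an L1 penalty, EM and ADMM) but not the update equations, so the following is only a minimal sketch of how such a scheme could be arranged, not the authors' implementation. It assumes zero-mean Gaussian components, an unweighted Lasso fit as the starting point, and an ADMM step size tied to the weighted Gram matrix; the function names (`soft_threshold`, `weighted_lasso_admm`, `mog_lasso_sketch`) and all default values are hypothetical.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def weighted_lasso_admm(X, y, w, lam, rho=None, n_iter=300):
    """ADMM for min_b 0.5 * sum_i w_i * (y_i - x_i @ b)**2 + lam * ||b||_1."""
    p = X.shape[1]
    gram = X.T @ (w[:, None] * X)
    if rho is None:
        # Assumption: scale the step size to the average curvature.
        rho = max(np.trace(gram) / p, 1e-8)
    A = gram + rho * np.eye(p)
    Xtwy = X.T @ (w * y)
    z = np.zeros(p)
    u = np.zeros(p)
    for _ in range(n_iter):
        b = np.linalg.solve(A, Xtwy + rho * (z - u))  # quadratic subproblem
        z = soft_threshold(b + u, lam / rho)          # L1 subproblem
        u = u + b - z                                 # scaled dual update
    return z


def mog_lasso_sketch(X, y, n_components=2, lam=1.0, n_outer=30):
    """EM-style loop: zero-mean MoG on residuals, weighted Lasso for beta."""
    n = X.shape[0]
    # Start from an ordinary (unweighted) Lasso fit.
    beta = weighted_lasso_admm(X, y, np.ones(n), lam)
    mix = np.full(n_components, 1.0 / n_components)
    var = np.var(y - X @ beta) * np.logspace(-1, 1, n_components)
    for _ in range(n_outer):
        r = y - X @ beta
        # E-step: component responsibilities, computed in the log domain.
        log_dens = (np.log(mix) - 0.5 * np.log(2 * np.pi * var)
                    - 0.5 * r[:, None] ** 2 / var)
        log_dens -= log_dens.max(axis=1, keepdims=True)
        gamma = np.exp(log_dens)
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step for mixing weights and variances (with a small floor).
        nk = gamma.sum(axis=0) + 1e-12
        mix = nk / n
        var = np.maximum((gamma * r[:, None] ** 2).sum(axis=0) / nk, 1e-8)
        # M-step for beta: each observation is weighted by sum_k gamma_ik / var_k.
        w = (gamma / var).sum(axis=1)
        beta = weighted_lasso_admm(X, y, w, lam)
    return beta, mix, var
```

In this arrangement, observations that the E-step assigns to a high-variance component receive small weights in the Lasso step, which is one way the robustness described in the abstract can arise. Calling `mog_lasso_sketch(X, y, n_components=2, lam=5.0)` returns the coefficient vector, the mixing weights and the component variances.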

Original language	English
Pages (from-to)	1738-1755
Number of pages	18
Journal	Journal of Applied Statistics
Volume	46
Issue number	10
DOIs	https://doi.org/10.1080/02664763.2019.1566448
State	Published - 27 Jul 2019
Externally published	Yes

Keywords

lasso
mixture of Gaussians
penalized regression
Robust regression
variable selection

Access to Document

10.1080/02664763.2019.1566448

Cite this

@article{73f7497b76f141928287d9bbcca6ddb4,

title = "Robust sparse regression by modeling noise as a mixture of gaussians",

abstract = "Regression analysis has been proven to be a quite effective tool in a large variety of fields. In many regression models, it is often assumed that noise is with a specific distribution. Although the theoretical analysis can be greatly facilitated, the model-fitting performance may be poor since the supposed noise distribution may deviate from real noise to a large extent. Meanwhile, the model is also expected to be robust in consideration of the complexity of real-world data. Without any assumption about noise, we propose in this paper a novel sparse regression method called MoG-Lasso to directly model noise in linear regression models via a mixture of Gaussian distributions (MoG). Meanwhile, the L1 penalty is included as a part of the loss function of MoG-Lasso to enhance its ability to identify a sparse model. As for the parameters in MoG-Lasso, we present an efficient algorithm to estimate them via the EM (expectation maximization) and ADMM (alternating direction method of multipliers) algorithms. With some simulated and real data contaminated by complex noise, the experiments show that the novel model MoG-Lasso performs better than several other popular methods in both {\textquoteleft}p>n{\textquoteright} and {\textquoteleft}p<n{\textquoteright} situations, including Lasso, LAD-Lasso and Huber-Lasso.",

keywords = "lasso, mixture of Gaussians, penalized regression, Robust regression, variable selection",

author = "Xu, Shuang and Zhang, {Chun Xia}",

note = "Publisher Copyright: {\textcopyright} 2019 Informa UK Limited, trading as Taylor \& Francis Group.",

year = "2019",

month = jul,

day = "27",

doi = "10.1080/02664763.2019.1566448",

language = "English",

volume = "46",

pages = "1738--1755",

journal = "Journal of Applied Statistics",

issn = "0266-4763",

publisher = "Routledge",

number = "10",

}

TY - JOUR

T1 - Robust sparse regression by modeling noise as a mixture of gaussians

AU - Xu, Shuang

AU - Zhang, Chun Xia

PY - 2019/7/27

Y1 - 2019/7/27

N2 - Regression analysis has been proven to be a quite effective tool in a large variety of fields. In many regression models, it is often assumed that noise is with a specific distribution. Although the theoretical analysis can be greatly facilitated, the model-fitting performance may be poor since the supposed noise distribution may deviate from real noise to a large extent. Meanwhile, the model is also expected to be robust in consideration of the complexity of real-world data. Without any assumption about noise, we propose in this paper a novel sparse regression method called MoG-Lasso to directly model noise in linear regression models via a mixture of Gaussian distributions (MoG). Meanwhile, the L1 penalty is included as a part of the loss function of MoG-Lasso to enhance its ability to identify a sparse model. As for the parameters in MoG-Lasso, we present an efficient algorithm to estimate them via the EM (expectation maximization) and ADMM (alternating direction method of multipliers) algorithms. With some simulated and real data contaminated by complex noise, the experiments show that the novel model MoG-Lasso performs better than several other popular methods in both ‘p>n’ and ‘p<n’ situations, including Lasso, LAD-Lasso and Huber-Lasso.

AB - Regression analysis has been proven to be a quite effective tool in a large variety of fields. In many regression models, it is often assumed that noise is with a specific distribution. Although the theoretical analysis can be greatly facilitated, the model-fitting performance may be poor since the supposed noise distribution may deviate from real noise to a large extent. Meanwhile, the model is also expected to be robust in consideration of the complexity of real-world data. Without any assumption about noise, we propose in this paper a novel sparse regression method called MoG-Lasso to directly model noise in linear regression models via a mixture of Gaussian distributions (MoG). Meanwhile, the L1 penalty is included as a part of the loss function of MoG-Lasso to enhance its ability to identify a sparse model. As for the parameters in MoG-Lasso, we present an efficient algorithm to estimate them via the EM (expectation maximization) and ADMM (alternating direction method of multipliers) algorithms. With some simulated and real data contaminated by complex noise, the experiments show that the novel model MoG-Lasso performs better than several other popular methods in both ‘p>n’ and ‘p<n’ situations, including Lasso, LAD-Lasso and Huber-Lasso.

KW - lasso

KW - mixture of Gaussians

KW - penalized regression

KW - Robust regression

KW - variable selection

UR - http://www.scopus.com/inward/record.url?scp=85059957359&partnerID=8YFLogxK

U2 - 10.1080/02664763.2019.1566448

DO - 10.1080/02664763.2019.1566448

M3 - Article

AN - SCOPUS:85059957359

SN - 0266-4763

VL - 46

SP - 1738

EP - 1755

JO - Journal of Applied Statistics

JF - Journal of Applied Statistics

IS - 10

ER -
