TY - JOUR
T1 - Robust sparse regression by modeling noise as a mixture of gaussians
AU - Xu, Shuang
AU - Zhang, Chun Xia
N1 - Publisher Copyright:
© 2019 Informa UK Limited, trading as Taylor & Francis Group.
PY - 2019/7/27
Y1 - 2019/7/27
N2 - Regression analysis has proven to be an effective tool in a wide variety of fields. Many regression models assume that the noise follows a specific distribution. Although this assumption greatly facilitates theoretical analysis, model-fitting performance may be poor when the assumed noise distribution deviates substantially from the real noise. Moreover, given the complexity of real-world data, the model is also expected to be robust. Without making any assumption about the noise, we propose in this paper a novel sparse regression method, called MoG-Lasso, which directly models the noise in linear regression models via a mixture of Gaussian distributions (MoG). The L1 penalty is included as part of the loss function of MoG-Lasso to enhance its ability to identify a sparse model. To estimate the parameters of MoG-Lasso, we present an efficient algorithm based on the EM (expectation maximization) and ADMM (alternating direction method of multipliers) algorithms. Experiments on simulated and real data contaminated by complex noise show that the novel model MoG-Lasso performs better than several other popular methods in both the ‘p>n’ and ‘p<n’ situations.
AB - Regression analysis has proven to be an effective tool in a wide variety of fields. Many regression models assume that the noise follows a specific distribution. Although this assumption greatly facilitates theoretical analysis, model-fitting performance may be poor when the assumed noise distribution deviates substantially from the real noise. Moreover, given the complexity of real-world data, the model is also expected to be robust. Without making any assumption about the noise, we propose in this paper a novel sparse regression method, called MoG-Lasso, which directly models the noise in linear regression models via a mixture of Gaussian distributions (MoG). The L1 penalty is included as part of the loss function of MoG-Lasso to enhance its ability to identify a sparse model. To estimate the parameters of MoG-Lasso, we present an efficient algorithm based on the EM (expectation maximization) and ADMM (alternating direction method of multipliers) algorithms. Experiments on simulated and real data contaminated by complex noise show that the novel model MoG-Lasso performs better than several other popular methods in both the ‘p>n’ and ‘p<n’ situations.
KW - lasso
KW - mixture of Gaussians
KW - penalized regression
KW - Robust regression
KW - variable selection
UR - http://www.scopus.com/inward/record.url?scp=85059957359&partnerID=8YFLogxK
U2 - 10.1080/02664763.2019.1566448
DO - 10.1080/02664763.2019.1566448
M3 - Article
AN - SCOPUS:85059957359
SN - 0266-4763
VL - 46
SP - 1738
EP - 1755
JO - Journal of Applied Statistics
JF - Journal of Applied Statistics
IS - 10
ER -