TY - GEN
T1 - Variational Bayesian dropout with a hierarchical prior
AU - Liu, Yuhang
AU - Dong, Wenyong
AU - Zhang, Lei
AU - Gong, Dong
AU - Shi, Qinfeng
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/6
Y1 - 2019/6
AB - Variational dropout (VD) is a generalization of Gaussian dropout that infers the posterior over network weights under a log-uniform prior, learning the weights and the dropout rate simultaneously. The log-uniform prior not only explains the regularization capacity of Gaussian dropout in network training but also underpins the inference of this posterior. However, the log-uniform prior is an improper prior (i.e., its integral is infinite), which renders the posterior inference ill-posed and thus restricts the regularization performance of VD. To address this problem, we present a new generalization of Gaussian dropout, termed variational Bayesian dropout (VBD), which instead exploits a hierarchical prior on the network weights and infers a new joint posterior. Specifically, we implement the hierarchical prior as a zero-mean Gaussian distribution whose variance is sampled from a uniform hyper-prior. We then incorporate this prior into the inference of the joint posterior over the network weights and the variance of the hierarchical prior, so that network training and dropout rate estimation can be cast as a joint optimization problem. More importantly, the hierarchical prior is a proper prior, which makes the posterior inference well-posed. In addition, we show that the proposed VBD can be seamlessly applied to network compression. Experiments on classification and network compression demonstrate the superior performance of the proposed VBD in regularizing network training.
KW - Categorization
KW - Deep Learning
KW - Recognition: Detection
KW - Retrieval
KW - Statistical Learning
UR - http://www.scopus.com/inward/record.url?scp=85075449414&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2019.00729
DO - 10.1109/CVPR.2019.00729
M3 - Conference contribution
AN - SCOPUS:85075449414
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 7117
EP - 7126
BT - Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
PB - IEEE Computer Society
T2 - 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Y2 - 16 June 2019 through 20 June 2019
ER -