Patch attention network with generative adversarial model for semi-supervised binocular disparity prediction

Zhibo Rao; Mingyi He; Yuchao Dai; Zhelun Shen

doi:10.1007/s00371-020-02001-5

School of Electronics and Information

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

In this paper, we address two challenges in binocular disparity estimation: (1) unsatisfactory results in occluded regions when a warping function is used in unsupervised learning; and (2) inefficiency in running time and parameter count caused by the many 3D convolutions in the feature matching module. To overcome these drawbacks, we propose a patch attention network for semi-supervised stereo matching. First, we employ a channel-attention mechanism that aggregates the cost volume by selecting among its different surfaces, which reduces the number of 3D convolutions; we call this the patch attention network (PA-Net). Second, we use PA-Net as a generator and combine it with a traditional unsupervised learning loss and an adversarial learning model to construct a semi-supervised learning framework that improves performance in occluded areas. We train PA-Net in supervised, semi-supervised, and unsupervised settings. Extensive experiments show that (1) our semi-supervised learning framework can overcome the drawbacks of unsupervised learning and significantly improve performance in ill-posed regions using only a few, or inaccurate, ground truths; and (2) PA-Net can outperform other state-of-the-art approaches in supervised learning while using fewer parameters.

Original language	English
Pages (from-to)	77-93
Number of pages	17
Journal	Visual Computer
Volume	38
Issue number	1
DOI	https://doi.org/10.1007/s00371-020-02001-5
Publication status	Published - Jan 2022

Cite this

@article{6933d927f70347949dfce3b67d2a3f7f,

title = "Patch attention network with generative adversarial model for semi-supervised binocular disparity prediction",

abstract = "In this paper, we address the challenging points of binocular disparity estimation: (1) unsatisfactory results in the occluded region when utilizing warping function in unsupervised learning; (2) inefficiency in running time and the number of parameters as adopting a lot of 3D convolutions in the feature matching module. To solve these drawbacks, we propose a patch attention network for semi-supervised stereo matching learning. First, we employ a channel-attention mechanism to aggregate the cost volume by selecting its different surfaces for reducing a large number of 3D convolution, called the patch attention network (PA-Net). Second, we use our proposed PA-Net as a generator and then combine it, traditional unsupervised learning loss, and the adversarial learning model to construct a semi-supervised learning framework for improving performance in the occluded areas. We have trained our PA-Net in supervised learning, semi-supervised learning, and unsupervised learning manners. Extensive experiments show that (1) our semi-supervised learning framework can overcome the drawbacks of unsupervised learning and significantly improve the performance in the ill-posed region by using only a few or inaccurate ground truths; (2) our PA-Net can outperform other state-of-the-art approaches in supervised learning and use fewer parameters.",

keywords = "Binocular disparity estimation, Generative adversarial model, Patch attention mechanism, Semi-supervised learning",

author = "Zhibo Rao and Mingyi He and Yuchao Dai and Zhelun Shen",

note = "Publisher Copyright: {\textcopyright} 2020, Springer-Verlag GmbH Germany, part of Springer Nature.",

year = "2022",

month = jan,

doi = "10.1007/s00371-020-02001-5",

language = "English",

volume = "38",

pages = "77--93",

journal = "Visual Computer",

issn = "0178-2789",

publisher = "Springer Verlag",

number = "1",

}

TY - JOUR

T1 - Patch attention network with generative adversarial model for semi-supervised binocular disparity prediction

AU - Rao, Zhibo

AU - He, Mingyi

AU - Dai, Yuchao

AU - Shen, Zhelun

PY - 2022/1

Y1 - 2022/1

N2 - In this paper, we address the challenging points of binocular disparity estimation: (1) unsatisfactory results in the occluded region when utilizing warping function in unsupervised learning; (2) inefficiency in running time and the number of parameters as adopting a lot of 3D convolutions in the feature matching module. To solve these drawbacks, we propose a patch attention network for semi-supervised stereo matching learning. First, we employ a channel-attention mechanism to aggregate the cost volume by selecting its different surfaces for reducing a large number of 3D convolution, called the patch attention network (PA-Net). Second, we use our proposed PA-Net as a generator and then combine it, traditional unsupervised learning loss, and the adversarial learning model to construct a semi-supervised learning framework for improving performance in the occluded areas. We have trained our PA-Net in supervised learning, semi-supervised learning, and unsupervised learning manners. Extensive experiments show that (1) our semi-supervised learning framework can overcome the drawbacks of unsupervised learning and significantly improve the performance in the ill-posed region by using only a few or inaccurate ground truths; (2) our PA-Net can outperform other state-of-the-art approaches in supervised learning and use fewer parameters.

AB - In this paper, we address the challenging points of binocular disparity estimation: (1) unsatisfactory results in the occluded region when utilizing warping function in unsupervised learning; (2) inefficiency in running time and the number of parameters as adopting a lot of 3D convolutions in the feature matching module. To solve these drawbacks, we propose a patch attention network for semi-supervised stereo matching learning. First, we employ a channel-attention mechanism to aggregate the cost volume by selecting its different surfaces for reducing a large number of 3D convolution, called the patch attention network (PA-Net). Second, we use our proposed PA-Net as a generator and then combine it, traditional unsupervised learning loss, and the adversarial learning model to construct a semi-supervised learning framework for improving performance in the occluded areas. We have trained our PA-Net in supervised learning, semi-supervised learning, and unsupervised learning manners. Extensive experiments show that (1) our semi-supervised learning framework can overcome the drawbacks of unsupervised learning and significantly improve the performance in the ill-posed region by using only a few or inaccurate ground truths; (2) our PA-Net can outperform other state-of-the-art approaches in supervised learning and use fewer parameters.

KW - Binocular disparity estimation

KW - Generative adversarial model

KW - Patch attention mechanism

KW - Semi-supervised learning

UR - http://www.scopus.com/inward/record.url?scp=85096025370&partnerID=8YFLogxK

U2 - 10.1007/s00371-020-02001-5

DO - 10.1007/s00371-020-02001-5

M3 - Article

AN - SCOPUS:85096025370

SN - 0178-2789

VL - 38

SP - 77

EP - 93

JO - Visual Computer

JF - Visual Computer

IS - 1

ER -
