TY - JOUR
T1 - Patch attention network with generative adversarial model for semi-supervised binocular disparity prediction
AU - Rao, Zhibo
AU - He, Mingyi
AU - Dai, Yuchao
AU - Shen, Zhelun
N1 - Publisher Copyright: © 2020, Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2022/1
Y1 - 2022/1
N2 - In this paper, we address two challenges in binocular disparity estimation: (1) unsatisfactory results in occluded regions when a warping function is used in unsupervised learning; (2) inefficiency in running time and parameter count caused by the many 3D convolutions in the feature matching module. To address these drawbacks, we propose a patch attention network for semi-supervised stereo matching. First, we employ a channel-attention mechanism that aggregates the cost volume by selecting among its different surfaces, which reduces the number of 3D convolutions; we call the resulting model the patch attention network (PA-Net). Second, we use PA-Net as a generator and combine it with a traditional unsupervised learning loss and an adversarial learning model to construct a semi-supervised learning framework that improves performance in occluded areas. We train PA-Net in supervised, semi-supervised, and unsupervised settings. Extensive experiments show that (1) our semi-supervised framework can overcome the drawbacks of unsupervised learning and significantly improve performance in ill-posed regions using only a few or inaccurate ground truths; (2) PA-Net can outperform other state-of-the-art approaches in supervised learning while using fewer parameters.
AB - In this paper, we address two challenges in binocular disparity estimation: (1) unsatisfactory results in occluded regions when a warping function is used in unsupervised learning; (2) inefficiency in running time and parameter count caused by the many 3D convolutions in the feature matching module. To address these drawbacks, we propose a patch attention network for semi-supervised stereo matching. First, we employ a channel-attention mechanism that aggregates the cost volume by selecting among its different surfaces, which reduces the number of 3D convolutions; we call the resulting model the patch attention network (PA-Net). Second, we use PA-Net as a generator and combine it with a traditional unsupervised learning loss and an adversarial learning model to construct a semi-supervised learning framework that improves performance in occluded areas. We train PA-Net in supervised, semi-supervised, and unsupervised settings. Extensive experiments show that (1) our semi-supervised framework can overcome the drawbacks of unsupervised learning and significantly improve performance in ill-posed regions using only a few or inaccurate ground truths; (2) PA-Net can outperform other state-of-the-art approaches in supervised learning while using fewer parameters.
KW - Binocular disparity estimation
KW - Generative adversarial model
KW - Patch attention mechanism
KW - Semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85096025370&partnerID=8YFLogxK
U2 - 10.1007/s00371-020-02001-5
DO - 10.1007/s00371-020-02001-5
M3 - Article
AN - SCOPUS:85096025370
SN - 0178-2789
VL - 38
SP - 77
EP - 93
JO - Visual Computer
JF - Visual Computer
IS - 1
ER -