TY - GEN
T1 - Enhancing Self-supervised Monocular Depth Estimation via Incorporating Robust Constraints
AU - Li, Rui
AU - He, Xiantuo
AU - Zhu, Yu
AU - Li, Xianjun
AU - Sun, Jinqiu
AU - Zhang, Yanning
N1 - Publisher Copyright:
© 2020 ACM.
PY - 2020/10/12
Y1 - 2020/10/12
N2 - Self-supervised depth estimation has shown great promise in inferring 3D structure from purely unannotated images. However, its performance usually drops when trained on images with changing brightness and moving objects. In this paper, we address this issue by enhancing the robustness of the self-supervised paradigm with a set of image-based and geometry-based constraints. Our contributions are threefold: 1) we propose a gradient-based robust photometric loss that restrains the false supervisory signals caused by brightness changes; 2) we filter out unreliable areas that violate the rigid-scene assumption with a novel combined selective mask, computed on the forward pass of the network by leveraging inter-loss consistency and loss-gradient consistency; and 3) we constrain the motion estimation network to generate across-frame consistent motions via a triplet-based cycle consistency constraint. Extensive experiments on the KITTI, Cityscape, and Make3D datasets demonstrate the superiority of our method: it effectively handles complex scenes with changing brightness and object motions. Both qualitative and quantitative results show that the proposed method outperforms state-of-the-art methods.
AB - Self-supervised depth estimation has shown great promise in inferring 3D structure from purely unannotated images. However, its performance usually drops when trained on images with changing brightness and moving objects. In this paper, we address this issue by enhancing the robustness of the self-supervised paradigm with a set of image-based and geometry-based constraints. Our contributions are threefold: 1) we propose a gradient-based robust photometric loss that restrains the false supervisory signals caused by brightness changes; 2) we filter out unreliable areas that violate the rigid-scene assumption with a novel combined selective mask, computed on the forward pass of the network by leveraging inter-loss consistency and loss-gradient consistency; and 3) we constrain the motion estimation network to generate across-frame consistent motions via a triplet-based cycle consistency constraint. Extensive experiments on the KITTI, Cityscape, and Make3D datasets demonstrate the superiority of our method: it effectively handles complex scenes with changing brightness and object motions. Both qualitative and quantitative results show that the proposed method outperforms state-of-the-art methods.
KW - brightness-robust photometric loss
KW - cycle consistency
KW - selective mask
KW - self-supervised depth estimation
UR - https://www.scopus.com/pages/publications/85101952952
U2 - 10.1145/3394171.3413706
DO - 10.1145/3394171.3413706
M3 - Conference contribution
AN - SCOPUS:85101952952
T3 - MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia
SP - 3108
EP - 3117
BT - MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
T2 - 28th ACM International Conference on Multimedia, MM 2020
Y2 - 12 October 2020 through 16 October 2020
ER -