Self-Supervised Monocular Depth Estimation With Frequency-Based Recurrent Refinement

Rui Li; Danna Xue; Yu Zhu; Hao Wu; Jinqiu Sun; Yanning Zhang

doi:10.1109/TMM.2022.3197367

School of Computer Science

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

Self-supervised monocular depth estimation has succeeded in learning scene geometry from only image pairs or sequences. However, it is still highly ill-posed for self-supervised depth estimation to generate high-quality depth maps with both global high accuracy and local fine details. To address this issue, we propose a novel frequency-based recurrent refinement scheme to improve the self-supervised depth estimation. Since the global and local depth representation can be correlated to high/low frequency coefficients in the frequency domain, we propose a frequency-based recurrent depth coefficient refinement (RDCR) scheme, which progressively refines both low frequency and high frequency depth coefficients with an RNN-based architecture in a multi-level manner. During the recurrent process, the depth coefficients generated from the previous time step are used as the input to generate the current depth coefficients, yielding progressively optimized depth estimations. Meanwhile, considering that the depth details often appear in areas with high image frequency, we further improve depth details during the RDCR process by leveraging the image-based high frequency components. Specifically, in each RDCR module, we enhance the high frequency depth representations by selecting and feeding the informative image-based high frequency features with a learned feature weighting mask. Extensive experiments show that the proposed method achieves globally accurate estimation with fine local details, outperforming other self-supervised methods in both quantitative and qualitative comparisons.
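The split between low frequency (global) and high frequency (local detail) depth coefficients described in the abstract can be illustrated with a wavelet decomposition. The following is a minimal sketch, not the authors' implementation: the single-level Haar wavelet, the function names `haar_dwt2` and `haar_idwt2`, and the toy depth map are all assumptions made for illustration, since the abstract does not name the wavelet or the network details. It only shows that the low frequency band carries the coarse depth layout, that the high frequency bands respond where depth changes sharply, and that the four bands together reconstruct the map exactly.

```python
# Illustrative sketch only. The Haar wavelet, the function names, and the toy
# depth map are assumptions for demonstration; the abstract does not specify them.

def haar_dwt2(depth):
    """Single-level 2D Haar transform of a map with even height and width.

    Returns four sub-bands: one low-frequency band (coarse structure) and
    three high-frequency bands (differences across columns, rows, diagonals).
    """
    low, col_diff, row_diff, diag = [], [], [], []
    for i in range(0, len(depth), 2):
        r_low, r_col, r_row, r_diag = [], [], [], []
        for j in range(0, len(depth[0]), 2):
            a, b = depth[i][j], depth[i][j + 1]
            c, d = depth[i + 1][j], depth[i + 1][j + 1]
            r_low.append((a + b + c + d) / 2.0)   # scaled block average
            r_col.append((a - b + c - d) / 2.0)   # change across columns
            r_row.append((a + b - c - d) / 2.0)   # change across rows
            r_diag.append((a - b - c + d) / 2.0)  # diagonal change
        low.append(r_low)
        col_diff.append(r_col)
        row_diff.append(r_row)
        diag.append(r_diag)
    return low, col_diff, row_diff, diag


def haar_idwt2(low, col_diff, row_diff, diag):
    """Invert haar_dwt2, rebuilding the full-resolution map."""
    out = []
    for i in range(len(low)):
        top, bottom = [], []
        for j in range(len(low[0])):
            l, x, y, z = low[i][j], col_diff[i][j], row_diff[i][j], diag[i][j]
            top.extend([(l + x + y + z) / 2.0, (l - x + y - z) / 2.0])
            bottom.extend([(l + x - y - z) / 2.0, (l - x - y + z) / 2.0])
        out.append(top)
        out.append(bottom)
    return out


# Toy depth map: a near surface (2.0) in the first column, far (5.0) elsewhere.
depth = [[2.0, 5.0, 5.0, 5.0] for _ in range(4)]
low, col_diff, row_diff, diag = haar_dwt2(depth)
restored = haar_idwt2(low, col_diff, row_diff, diag)
```

In this toy case only the column-difference band is nonzero, and only in the blocks that contain the depth discontinuity, which is the kind of locality a high frequency refinement step could exploit.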

Original language	English
Pages (from-to)	5626-5637
Number of pages	12
Journal	IEEE Transactions on Multimedia
Volume	25
DOIs	https://doi.org/10.1109/TMM.2022.3197367
State	Published - 2023

Keywords

image-based depth enhancement
recurrent depth coefficient refinement
Self-supervised depth estimation
wavelet

Cite this

@article{75588516d5f6472aae7539836fb2c553,
  title = "Self-Supervised Monocular Depth Estimation With Frequency-Based Recurrent Refinement",
  abstract = "Self-supervised monocular depth estimation has succeeded in learning scene geometry from only image pairs or sequences. However, it is still highly ill-posed for self-supervised depth estimation to generate high-quality depth maps with both global high accuracy and local fine details. To address this issue, we propose a novel frequency-based recurrent refinement scheme to improve the self-supervised depth estimation. Since the global and local depth representation can be correlated to high/low frequency coefficients in the frequency domain, we propose a frequency-based recurrent depth coefficient refinement (RDCR) scheme, which progressively refines both low frequency and high frequency depth coefficients with an RNN-based architecture in a multi-level manner. During the recurrent process, the depth coefficients generated from the previous time step are used as the input to generate the current depth coefficients, yielding progressively optimized depth estimations. Meanwhile, considering that the depth details often appear in areas with high image frequency, we further improve depth details during the RDCR process by leveraging the image-based high frequency components. Specifically, in each RDCR module, we enhance the high frequency depth representations by selecting and feeding the informative image-based high frequency features with a learned feature weighting mask. Extensive experiments show that the proposed method achieves globally accurate estimation with fine local details, outperforming other self-supervised methods in both quantitative and qualitative comparisons.",
  keywords = "image-based depth enhancement, recurrent depth coefficient refinement, Self-supervised depth estimation, wavelet",
  author = "Rui Li and Danna Xue and Yu Zhu and Hao Wu and Jinqiu Sun and Yanning Zhang",
  note = "Publisher Copyright: {\textcopyright} 2022 IEEE.",
  year = "2023",
  doi = "10.1109/TMM.2022.3197367",
  language = "English",
  volume = "25",
  pages = "5626--5637",
  journal = "IEEE Transactions on Multimedia",
  issn = "1520-9210",
  publisher = "Institute of Electrical and Electronics Engineers Inc.",
}

TY  - JOUR
T1  - Self-Supervised Monocular Depth Estimation With Frequency-Based Recurrent Refinement
AU  - Li, Rui
AU  - Xue, Danna
AU  - Zhu, Yu
AU  - Wu, Hao
AU  - Sun, Jinqiu
AU  - Zhang, Yanning
PY  - 2023
Y1  - 2023
AB  - Self-supervised monocular depth estimation has succeeded in learning scene geometry from only image pairs or sequences. However, it is still highly ill-posed for self-supervised depth estimation to generate high-quality depth maps with both global high accuracy and local fine details. To address this issue, we propose a novel frequency-based recurrent refinement scheme to improve the self-supervised depth estimation. Since the global and local depth representation can be correlated to high/low frequency coefficients in the frequency domain, we propose a frequency-based recurrent depth coefficient refinement (RDCR) scheme, which progressively refines both low frequency and high frequency depth coefficients with an RNN-based architecture in a multi-level manner. During the recurrent process, the depth coefficients generated from the previous time step are used as the input to generate the current depth coefficients, yielding progressively optimized depth estimations. Meanwhile, considering that the depth details often appear in areas with high image frequency, we further improve depth details during the RDCR process by leveraging the image-based high frequency components. Specifically, in each RDCR module, we enhance the high frequency depth representations by selecting and feeding the informative image-based high frequency features with a learned feature weighting mask. Extensive experiments show that the proposed method achieves globally accurate estimation with fine local details, outperforming other self-supervised methods in both quantitative and qualitative comparisons.
KW  - image-based depth enhancement
KW  - recurrent depth coefficient refinement
KW  - Self-supervised depth estimation
KW  - wavelet
UR  - http://www.scopus.com/inward/record.url?scp=85136031190&partnerID=8YFLogxK
U2  - 10.1109/TMM.2022.3197367
DO  - 10.1109/TMM.2022.3197367
M3  - Article
AN  - SCOPUS:85136031190
SN  - 1520-9210
VL  - 25
SP  - 5626
EP  - 5637
JO  - IEEE Transactions on Multimedia
JF  - IEEE Transactions on Multimedia
ER  -
