TY - JOUR
T1 - Lightweight self-supervised monocular depth estimation based on a conditional diffusion model
AU - Gao, Wei
AU - Zhang, Junding
AU - Bakala, Nesmy Parice
AU - Tang, Chengkai
N1 - Publisher Copyright: © 2026 Elsevier Ltd.
PY - 2026/8/15
Y1 - 2026/8/15
AB - Self-supervised monocular depth estimation (MDE), which eliminates the reliance on ground-truth depth annotations, has become a key enabling technology for autonomous driving and robotic perception, where both accuracy and computational efficiency are critical for real-time operation. However, most existing approaches emphasize performance gains at the expense of efficiency, while current lightweight methods struggle to achieve an optimal balance between model complexity and estimation accuracy. To address these challenges, we propose LDiffDepth, a lightweight yet effective self-supervised MDE framework built upon a conditional diffusion model. Specifically, LDiffDepth integrates a lightweight noise predictor (LNP) to substantially reduce the computational burden of the diffusion process by enabling efficient noise estimation. Furthermore, an efficient context prior attention (ECPA) module is introduced to enhance depth prediction in weakly textured and structurally complex regions with minimal additional overhead. Comprehensive experiments on multiple benchmark datasets demonstrate that LDiffDepth achieves a superior trade-off between efficiency and accuracy, consistently outperforming existing state-of-the-art lightweight self-supervised MDE methods. The implementation is publicly available at https://github.com/zjdzhou/LDiffDepth .
KW - Conditional diffusion
KW - Lightweight monocular depth estimation
KW - Self-supervised learning
UR - https://www.scopus.com/pages/publications/105038352649
U2 - 10.1016/j.engappai.2026.115067
DO - 10.1016/j.engappai.2026.115067
M3 - Article
AN - SCOPUS:105038352649
SN - 0952-1976
VL - 178
JO - Engineering Applications of Artificial Intelligence
JF - Engineering Applications of Artificial Intelligence
M1 - 115067
ER -