Abstract
Self-supervised monocular depth estimation (MDE), which eliminates the reliance on ground-truth depth annotations, has become a key enabling technology for autonomous driving and robotic perception, where both accuracy and computational efficiency are critical for real-time operation. However, most existing approaches emphasize performance gains at the expense of efficiency, while current lightweight methods struggle to achieve an optimal balance between model complexity and estimation accuracy. To address these challenges, we propose LDiffDepth, a lightweight yet effective self-supervised MDE framework built upon a conditional diffusion model. Specifically, LDiffDepth integrates a lightweight noise predictor (LNP) to substantially reduce the computational burden of the diffusion process by enabling efficient noise estimation. Furthermore, an efficient context prior attention (ECPA) module is introduced to enhance depth prediction in weakly textured and structurally complex regions with minimal additional overhead. Comprehensive experiments on multiple benchmark datasets demonstrate that LDiffDepth achieves a superior trade-off between efficiency and accuracy, consistently outperforming existing state-of-the-art lightweight self-supervised MDE methods. The implementation is publicly available at https://github.com/zjdzhou/LDiffDepth.
| Original language | English |
|---|---|
| Article number | 115067 |
| Journal | Engineering Applications of Artificial Intelligence |
| Volume | 178 |
| DOIs | |
| State | Published - 15 Aug 2026 |
Keywords
- Conditional diffusion
- Lightweight monocular depth estimation
- Self-supervised learning
Fingerprint
Dive into the research topics of 'Lightweight self-supervised monocular depth estimation based on a conditional diffusion model'. Together they form a unique fingerprint.