Edge Devices Friendly Self-Supervised Monocular Depth Estimation via Knowledge Distillation

Wei Gao, Di Rao, Yang Yang, Jie Chen

Research output: Contribution to journal › Article › peer-review

Abstract

Self-supervised monocular depth estimation (MDE) has great potential for deployment in a wide range of applications, including virtual reality, autonomous driving, and robotics. Nevertheless, most previous studies have focused on complex architectures in pursuit of better MDE performance. In this letter, we aim to develop a lightweight yet highly effective self-supervised MDE model that delivers competitive performance on edge devices. We introduce a novel MobileViT-based depth (MViTDepth) network that effectively captures both local features and global information by leveraging the strengths of convolutional neural networks (CNNs) and a vision transformer (ViT). To further compress the proposed MViTDepth model, we employ knowledge distillation, which also improves depth estimation performance. Specifically, the self-supervised MDE model MonoViT is used as a teacher to construct a knowledge distillation loss for optimizing the student model. Experimental results on benchmark datasets demonstrate that the proposed MViTDepth significantly outperforms Monodepth2 in both parameter count and accuracy, indicating its suitability for deployment on edge devices.
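The abstract does not give the exact form of the distillation loss, so the sketch below is only a rough illustration of how a frozen teacher's disparity prediction might supervise a lightweight student alongside the usual photometric self-supervision. The function names, the L1 distance, the resizing step, and the weight alpha are all assumptions for illustration, not details taken from the paper.

```python
# Hypothetical sketch of disparity-level knowledge distillation for
# self-supervised monocular depth estimation. The actual MViTDepth/MonoViT
# loss formulation in the paper may differ.
import torch
import torch.nn.functional as F


def distillation_loss(student_disp: torch.Tensor,
                      teacher_disp: torch.Tensor) -> torch.Tensor:
    """L1 distance between student and (frozen) teacher disparity maps."""
    # Resize the teacher output in case the two networks predict at
    # different resolutions (an assumption, not stated in the paper).
    teacher_disp = F.interpolate(
        teacher_disp, size=student_disp.shape[-2:],
        mode="bilinear", align_corners=False,
    )
    return F.l1_loss(student_disp, teacher_disp)


def total_loss(student_disp: torch.Tensor,
               teacher_disp: torch.Tensor,
               photometric_loss: torch.Tensor,
               alpha: float = 0.1) -> torch.Tensor:
    """Combine the standard self-supervised photometric loss with the
    distillation term; the weighting alpha is a placeholder value."""
    return photometric_loss + alpha * distillation_loss(student_disp, teacher_disp)
```

In such a setup the teacher (MonoViT in the paper) would typically be kept frozen and run under `torch.no_grad()`, so that gradients flow only into the lightweight student network.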

Original language: English
Pages (from-to): 8470-8477
Number of pages: 8
Journal: IEEE Robotics and Automation Letters
Volume: 8
Issue number: 12
DOIs
State: Published - 1 Dec 2023

Keywords

  • Autonomous vehicle navigation
  • Deep learning for visual perception
  • Lightweight monocular depth estimation
