Solving Monocular Sensors Depth Prediction Using MLP-Based Architecture and Multi-Scale Inverse Attention

Zeyu Cheng, Yi Zhang, Chengkai Tang

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Monocular sensors depth prediction has received continuous attention in recent years because of its wide application in autonomous driving, intelligent system navigation, and other fields. Convolutional neural networks have long dominated monocular depth prediction, and the recent introduction of Transformer-based and MLP-based architectures in computer vision has provided new ideas for the task. However, these architectures suffer from problems such as high computational complexity and excessive parameter counts. In this paper, we propose MLP-Depth, a lightweight monocular depth prediction method based on a hierarchical multi-stage MLP that uses depth-wise convolution to improve local modeling capability and reduce parameters and computational cost. We also design a multi-scale inverse attention mechanism to implicitly improve the global expressiveness of MLP-Depth. Our method effectively reduces the number of parameters of monocular depth prediction networks that use Transformer-like architectures, and extensive experiments show that MLP-Depth achieves competitive results with fewer parameters on challenging outdoor and indoor datasets.
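To illustrate why depth-wise convolution reduces parameters, the following minimal sketch compares the weight count of a standard convolution with that of a depth-wise convolution. It is not the MLP-Depth implementation: the function names and the layer sizes (64 channels, 3 x 3 kernel) are hypothetical and chosen only for illustration.

```python
def conv_params(c_in, c_out, k, bias=True):
    """Parameter count of a standard k x k convolution.

    Every output channel has one k x k filter per input channel.
    """
    return k * k * c_in * c_out + (c_out if bias else 0)


def depthwise_conv_params(channels, k, bias=True):
    """Parameter count of a depth-wise k x k convolution.

    Each channel is filtered independently by a single k x k kernel.
    """
    return k * k * channels + (channels if bias else 0)


# Hypothetical layer sizes, not taken from the paper.
channels, k = 64, 3
standard = conv_params(channels, channels, k)
depthwise = depthwise_conv_params(channels, k)
print(f"standard: {standard}, depth-wise: {depthwise}")
```

Under these assumed sizes the depth-wise layer needs far fewer weights than the standard one, because its cost grows linearly rather than quadratically with the channel count.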

Original language: English
Pages (from-to): 16178-16189
Number of pages: 12
Journal: IEEE Sensors Journal
Volume: 22
Issue number: 16
DOIs
State: Published - 15 Aug 2022

Keywords

  • Hierarchical multi-stage MLP
  • Monocular sensors depth prediction
  • Multi-scale inverse attention
