BEVRefiner: Improving 3D Object Detection in Bird's-Eye-View via Dual Refinement

Binglu Wang, Haowen Zheng, Lei Zhang, Nian Liu, Rao Muhammad Anwer, Hisham Cholakkal, Yongqiang Zhao, Zhijun Li

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

Many multi-view camera-based 3D object detection models transform image features into Bird's-Eye-View (BEV) via the Lift-Splat-Shoot (LSS) mechanism, which 'lifts' 2D camera-view features to the 3D voxel space based on the predicted depth distribution and then 'splats' the 3D features onto a BEV plane for subsequent 3D object detection. However, the BEV feature in such a one-stage view transformation scheme relies heavily on the quality of the predicted depth distribution and the 2D camera-view features, which in turn determines the final detection performance. In this paper, we propose BEVRefiner, a model that performs dual refinement of both the depth prediction and the 2D camera-view features. On the one hand, we perform light-weight depth refinement in the depth distribution frustum space by incorporating 3D context and a depth distribution prior. On the other hand, we reproject the BEV feature back to each camera view to enhance the 2D image features. In this way, the original camera-view features are enhanced by implicitly incorporating 3D context and multi-view context, which cannot be obtained in the original 2D camera view. We also propose using only the dominant depth bins for the reprojection to reduce the computational burden. Finally, we generate the refined BEV feature from the refined depth distribution and camera-view features for more accurate 3D object detection. BEVRefiner can be plugged into LSS-based BEV detectors, and we perform extensive experiments on the representative model BEVDet, which strongly verify the efficiency of our proposed approach under several settings.
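
The 'lift' step and the dominant-bin idea mentioned in the abstract can be illustrated for a single pixel. The following is a minimal sketch in plain Python, not the authors' implementation: the function names, the top-k selection rule, and the toy numbers are all assumptions made for illustration, since the abstract does not state how dominant depth bins are chosen.

```python
import math

def softmax(logits):
    """Turn raw per-bin depth scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def lift_pixel(feature, depth_probs):
    """'Lift' one pixel: outer product of its depth distribution (D bins)
    and its C-dim feature, giving one weighted feature copy per depth bin."""
    return [[p * f for f in feature] for p in depth_probs]

def dominant_bins(depth_probs, k):
    """Indices of the k most probable depth bins.
    Top-k is an assumed selection rule, used here only for illustration."""
    order = sorted(range(len(depth_probs)),
                   key=lambda i: depth_probs[i], reverse=True)
    return sorted(order[:k])

# Toy example: 4 depth bins, 2 feature channels (hypothetical values).
depth_probs = softmax([0.1, 2.0, 0.3, 1.5])
feature = [1.0, -2.0]
frustum = lift_pixel(feature, depth_probs)   # shape: 4 bins x 2 channels
top = dominant_bins(depth_probs, k=2)        # bins kept for reprojection
```

Because the depth probabilities sum to one, summing the lifted features over all bins recovers the original pixel feature. Restricting later processing to the bins in `top` skips low-probability depths, which is the kind of saving the abstract attributes to using dominant depth bins only.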

Original language: English
Pages (from-to): 15094-15105
Number of pages: 12
Journal: IEEE Transactions on Intelligent Transportation Systems
Volume: 25
Issue number: 10
State: Published - 2024

Keywords

  • 3D object detection
  • BEV
  • depth prediction
  • refinement
