LoopRefine: Deep Camera Pose Estimation With Loop Consistency

Zhiwei Wang, Hui Deng, Jiawei Shi, Mochu Xiang, Zhicheng Lu, Qi Liu, Yuchao Dai

Research output: Contribution to journal › Article › peer-review

Abstract

Recently, pose estimation under sparse views (≤ 10) has witnessed significant advances with the development of deep learning. Most existing methods directly regress the absolute poses, demonstrating leading performance on benchmarks. However, directly regressing the scaled poses using deep neural networks is inherently ill-posed, resulting in overfitted models that perform poorly in diverse scenarios. In contrast, we resort to the well-posed solutions from traditional Structure-from-Motion (SfM) pipelines and propose LoopRefine, a diffusion model that assumes known camera intrinsics, estimates pairwise normalized relative camera poses, and utilizes triplet coplanar constraints to align the scale of the camera poses. Like traditional SfM methods, LoopRefine incrementally constructs camera triplets, resolving the scale ambiguities by gradually recovering the pose scales and connecting the pose graph. To further improve the pose estimation accuracy during inference, we explore pose compatibility by randomly chaining the loop transformations on the pose graph and performing iterative loop consistency-based optimization. Extensive experiments demonstrate the superiority of our method, and the generalization performance on both object-centered datasets and scene datasets also proves the effectiveness of the integrated geometric constraints.
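The loop consistency idea at the core of the iterative optimization can be illustrated with a short sketch: chaining the relative transformations around a closed loop in the pose graph should compose to the identity, and the rotation and translation residuals of that composition measure the inconsistency that the optimization drives toward zero. The sketch below is a minimal illustration under assumed conventions (4×4 homogeneous camera-to-camera transforms composed by pre-multiplication); the function names are hypothetical and do not reflect the authors' implementation.

```python
import numpy as np

def relative_pose(R, t):
    # Pack a rotation R (3x3) and translation t (3,) into a 4x4 homogeneous transform.
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def loop_consistency_error(loop_transforms):
    # Chain the relative transforms around a closed loop in the pose graph.
    # For a perfectly consistent loop T_12, T_23, ..., T_n1 the composition
    # equals the identity; the residual rotation angle and translation norm
    # quantify the inconsistency to be reduced.
    T = np.eye(4)
    for T_ij in loop_transforms:
        T = T_ij @ T  # pre-multiply: T_n1 @ ... @ T_23 @ T_12
    cos_angle = np.clip((np.trace(T[:3, :3]) - 1.0) / 2.0, -1.0, 1.0)
    rot_err = np.degrees(np.arccos(cos_angle))   # residual rotation angle (deg)
    trans_err = np.linalg.norm(T[:3, 3])         # residual translation norm
    return rot_err, trans_err

# Example: a noisy three-view loop (camera triplet), with hypothetical random poses.
rng = np.random.default_rng(0)

def small_rotation(eps):
    # Random rotation via the Rodrigues formula with angle scaled by eps.
    w = eps * rng.standard_normal(3)
    angle = np.linalg.norm(w)
    if angle < 1e-12:
        return np.eye(3)
    k = w / angle
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

T12 = relative_pose(small_rotation(0.3), rng.standard_normal(3))
T23 = relative_pose(small_rotation(0.3), rng.standard_normal(3))
# Closing transform chosen so the loop is consistent up to small noise.
T31 = relative_pose(small_rotation(0.01), 0.01 * rng.standard_normal(3)) @ np.linalg.inv(T23 @ T12)
print(loop_consistency_error([T12, T23, T31]))
```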

Original language: English
Journal: IEEE Robotics and Automation Letters
DOIs
State: Accepted/In press - 2025

Keywords

  • Coplanar Constraints
  • Diffusion Model
  • Loop Consistency-based Optimization
  • Pose Estimation
  • Sparse Views
