ETV-MVS: Robust Visibility-Aware Multi-View Stereo with Epipolar Line-Based Transformer

Shaoqian Wang, Xiaokun Ding, Yuxin Mao, Yuchao Dai

Research output: Contribution to journal > Article > peer-review

Abstract

Multi-View Stereo (MVS) is a pivotal technique in computer vision for reconstructing 3D models from multiple images by estimating depth maps. However, the reconstruction performance is hindered by visibility challenges, such as occlusions and non-overlapping regions. In this paper, we propose an innovative visibility-aware framework to address these issues. Central to our method is an Epipolar Line-based Transformer (ELT) module, which capitalizes on the epipolar line correspondence and candidate matching features between images to enhance the feature representation and correlation robustness. Furthermore, we propose a novel Supervised Visibility Estimation (SVE) module that estimates high-precision visibility maps, transcending the constraints of previous methods that rely on indirect supervision. By integrating these modules, our method achieves state-of-the-art results on the benchmarks and demonstrates its capability to perform high-quality reconstructions even in challenging regions.

Original language: English
Pages (from-to): 520-533
Number of pages: 14
Journal: Big Data Mining and Analytics
Volume: 8
Issue number: 3
State: Published - 2025

Keywords

  • Deep Neural Networks (DNN)
  • epipolar geometry
  • Multi-View Stereo (MVS)
  • Transformer
