
Semantic-Guided Multiview Stereo Reconstruction for Aerial Image

  • Wei Zhang
  • Zhigang Yang
  • Qiang Li
  • Qi Wang
  • Northwestern Polytechnical University Xian

Research output: Contribution to journal, Article, peer-reviewed

2 citations (Scopus)

Abstract

Learning-based multiview stereo (MVS) depth estimation methods have achieved significant results in large-scale 3-D reconstruction benchmarks. However, adjacent terrain in aerial images interferes with depth estimation along building edges during the matching process, leading to inaccurate results. To address this problem, we propose a new end-to-end MVS network, named FuS-MVSNet, which fuses monocular depth probability as semantic guidance into the multiview geometry-based MVS framework. By combining the strengths of geometric consistency and local semantics, FuS-MVSNet achieves notable enhancements in both accuracy and robustness. Specifically, we first construct a monocular branch based on the pretrained Depth Anything model to perform monocular metric depth estimation. The nonshared parameters ensure that the depth estimation process is independent of the multiview branch, focusing exclusively on semantic depth inference. Subsequently, to incorporate monocular features into the multiview network, we introduce a volume adaptive fusion module, which adaptively integrates monocular feature volumes into the standard cost volume via an attention mechanism and guides the cost volume regularization. Finally, confidence-based dynamic selection between the two prediction branches ensures that the more robust branch result is selected under challenging conditions. Qualitative and quantitative results indicate that we achieve competitive performance on multiple benchmarks, including the WHU and LuoJia-MVS datasets.
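
The final step of the abstract, confidence-based selection between the monocular and multiview predictions, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how confidence is computed or at what granularity the selection happens, so the sketch assumes that each branch already provides a per-pixel confidence map and that the choice is made pixel by pixel. The function name `select_depth` and all array values are hypothetical.

```python
import numpy as np


def select_depth(depth_mvs, conf_mvs, depth_mono, conf_mono):
    """Pick, per pixel, the depth from the branch with the higher confidence.

    Assumption (not from the paper): both branches supply a depth map and a
    confidence map of the same shape, and selection is done per pixel.
    Returns the selected depth map and a boolean mask that is True where the
    multiview branch was used.
    """
    depth_mvs = np.asarray(depth_mvs, dtype=np.float64)
    conf_mvs = np.asarray(conf_mvs, dtype=np.float64)
    depth_mono = np.asarray(depth_mono, dtype=np.float64)
    conf_mono = np.asarray(conf_mono, dtype=np.float64)

    if not (depth_mvs.shape == conf_mvs.shape == depth_mono.shape == conf_mono.shape):
        raise ValueError("all depth and confidence maps must share one shape")

    # Ties go to the multiview branch; this tie-break is an arbitrary choice.
    use_mvs = conf_mvs >= conf_mono
    return np.where(use_mvs, depth_mvs, depth_mono), use_mvs


# Toy 2x2 example: the multiview branch is unreliable in the bottom-right
# pixel (for instance, a building edge), so the monocular depth is used there.
fused, mask = select_depth(
    depth_mvs=[[10.0, 12.0], [11.0, 30.0]],
    conf_mvs=[[0.9, 0.8], [0.7, 0.2]],
    depth_mono=[[10.5, 12.5], [11.5, 13.0]],
    conf_mono=[[0.5, 0.5], [0.5, 0.6]],
)
print(fused)
print(mask)
```

A hard per-pixel switch keeps each output value traceable to one branch. A soft blend weighted by confidence would be an equally plausible reading of the abstract.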

Original language: English
Article number: 5630611
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 63
DOI
Publication status: Published - 2025
