基于多视图传播的无监督三维重建方法

Jingfeng Luo; Dongli Yuan; Lan Zhang; Yaohong Qu; Shihong Su

doi:10.1051/jnwpu/20244210129

Translated title of the contribution: Unsupervised 3D reconstruction method based on multi-view propagation

School of Automation

Research output: Contribution to journal › Article › peer-review

Abstract

This paper proposes an end-to-end deep learning framework that reconstructs 3D models by computing depth maps from multiple views. The proposed unsupervised 3D reconstruction method, based on multi-view propagation, addresses two problems: the large GPU memory consumption of most current methods, which use 3D convolutions to regularize a 3D cost volume and regress the initial depth map, and the difficulty of obtaining ground-truth depth for supervised methods due to device limitations. Inspired by the Patchmatch algorithm, the method divides the depth range into n layers and obtains depth hypotheses through multi-view propagation. In addition, a multi-metric loss function built from photometric consistency, structural similarity, and depth smoothness across multiple views serves as the supervisory signal for learning depth prediction. Experimental results show that the proposed method achieves highly competitive performance and generalization on DTU, Tanks & Temples, and a self-collected dataset. Specifically, it is at least 1.7 times faster and requires more than 75% less memory than the method that uses 3D cost volume regularization.
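
The two ingredients named in the abstract, layered depth hypotheses and a multi-metric unsupervised loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the loss weights, the global (non-windowed) SSIM, and the edge-aware form of the smoothness term are all assumptions made for illustration, and the source image is assumed to have already been warped into the reference view using the current depth estimate.

```python
import numpy as np

def depth_hypotheses(d_min, d_max, n):
    """Split [d_min, d_max] into n layers and return each layer's centre depth."""
    edges = np.linspace(d_min, d_max, n + 1)
    return 0.5 * (edges[:-1] + edges[1:])

def photometric_loss(ref, warped, mask):
    """Mean absolute intensity difference over valid (mask == True) pixels."""
    return np.abs(ref - warped)[mask].mean()

def ssim_loss(ref, warped, c1=0.01 ** 2, c2=0.03 ** 2):
    """(1 - SSIM) / 2 with SSIM computed over the whole image.

    Simplification: practical losses usually compute SSIM in local windows.
    """
    mu_x, mu_y = ref.mean(), warped.mean()
    var_x, var_y = ref.var(), warped.var()
    cov = ((ref - mu_x) * (warped - mu_y)).mean()
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
    return (1.0 - ssim) / 2.0

def smoothness_loss(depth, image):
    """First-order depth smoothness, down-weighted where the image has edges."""
    dx_d = np.abs(np.diff(depth, axis=1))
    dy_d = np.abs(np.diff(depth, axis=0))
    dx_i = np.abs(np.diff(image, axis=1))
    dy_i = np.abs(np.diff(image, axis=0))
    return (dx_d * np.exp(-dx_i)).mean() + (dy_d * np.exp(-dy_i)).mean()

def multi_metric_loss(ref, warped, mask, depth,
                      w_photo=1.0, w_ssim=1.0, w_smooth=0.1):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (w_photo * photometric_loss(ref, warped, mask)
            + w_ssim * ssim_loss(ref, warped)
            + w_smooth * smoothness_loss(depth, ref))
```

With a perfectly warped source image and a constant depth map, all three terms vanish, which is the behaviour an unsupervised signal of this kind relies on: the loss falls as the predicted depth makes the views agree.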

Original language	Chinese (Traditional)
Pages (from-to)	129-137
Number of pages	9
Journal	Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University
Volume	42
Issue number	1
DOIs	https://doi.org/10.1051/jnwpu/20244210129
State	Published - Feb 2024

Cite this

@article{496743f0f95f43e8a8182ceda0c7b49c,
title = "基于多视图传播的无监督三维重建方法",
abstract = "This paper proposes an end-to-end deep learning framework that reconstructs 3D models by computing depth maps from multiple views. The proposed unsupervised 3D reconstruction method, based on multi-view propagation, addresses two problems: the large GPU memory consumption of most current methods, which use 3D convolutions to regularize a 3D cost volume and regress the initial depth map, and the difficulty of obtaining ground-truth depth for supervised methods due to device limitations. Inspired by the Patchmatch algorithm, the method divides the depth range into n layers and obtains depth hypotheses through multi-view propagation. In addition, a multi-metric loss function built from photometric consistency, structural similarity, and depth smoothness across multiple views serves as the supervisory signal for learning depth prediction. Experimental results show that the proposed method achieves highly competitive performance and generalization on DTU, Tanks \& Temples, and a self-collected dataset. Specifically, it is at least 1.7 times faster and requires more than 75\% less memory than the method that uses 3D cost volume regularization.",
keywords = "3D reconstruction, multi-metric loss function, multi-view propagation, Patchmatch algorithm, unsupervised",
author = "Jingfeng Luo and Dongli Yuan and Lan Zhang and Yaohong Qu and Shihong Su",
note = "Publisher Copyright: {\textcopyright}2024 Journal of Northwestern Polytechnical University.",
year = "2024",
month = feb,
doi = "10.1051/jnwpu/20244210129",
language = "Chinese (Traditional)",
volume = "42",
pages = "129--137",
journal = "Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University",
issn = "1000-2758",
publisher = "Northwestern Polytechnical University",
number = "1",
}

TY - JOUR

T1 - 基于多视图传播的无监督三维重建方法

AU - Luo, Jingfeng

AU - Yuan, Dongli

AU - Zhang, Lan

AU - Qu, Yaohong

AU - Su, Shihong

PY - 2024/2

Y1 - 2024/2

N2 - This paper proposes an end-to-end deep learning framework that reconstructs 3D models by computing depth maps from multiple views. The proposed unsupervised 3D reconstruction method, based on multi-view propagation, addresses two problems: the large GPU memory consumption of most current methods, which use 3D convolutions to regularize a 3D cost volume and regress the initial depth map, and the difficulty of obtaining ground-truth depth for supervised methods due to device limitations. Inspired by the Patchmatch algorithm, the method divides the depth range into n layers and obtains depth hypotheses through multi-view propagation. In addition, a multi-metric loss function built from photometric consistency, structural similarity, and depth smoothness across multiple views serves as the supervisory signal for learning depth prediction. Experimental results show that the proposed method achieves highly competitive performance and generalization on DTU, Tanks & Temples, and a self-collected dataset. Specifically, it is at least 1.7 times faster and requires more than 75% less memory than the method that uses 3D cost volume regularization.

KW - 3D reconstruction

KW - multi-metric loss function

KW - multi-view propagation

KW - Patchmatch algorithm

KW - unsupervised

UR - http://www.scopus.com/inward/record.url?scp=85188860685&partnerID=8YFLogxK

U2 - 10.1051/jnwpu/20244210129

DO - 10.1051/jnwpu/20244210129

M3 - Article

AN - SCOPUS:85188860685

SN - 1000-2758

VL - 42

SP - 129

EP - 137

JO - Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University

JF - Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University

IS - 1

ER -
