Disentangling Deep Network for Reconstructing 3D Object Shapes from Single 2D Images

Yang Yang, Junwei Han, Dingwen Zhang, De Cheng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

Recovering the 3D shapes of deformable objects from single 2D images is an extremely challenging and ill-posed problem. Most existing approaches are based on structure-from-motion or graph inference, where a 3D shape is solved by fitting 2D keypoints/masks instead of directly using the vital cues in the original 2D image. These methods usually require multiple views of an object instance and rely on accurate labeling, detection, and matching of 2D keypoints/masks across multiple images. To overcome these limitations, we reconstruct 3D deformable object shapes directly from the given unconstrained 2D images. In training, instead of using multiple images per object instance, our approach relaxes this constraint and uses images from the same object category (with one 2D image per object instance). The key is to disentangle the category-specific representation of the 3D shape identity and the instance-specific representation of the 3D shape displacement from the 2D training images. In testing, the 3D shape of an object can be reconstructed from the given image by deforming the 3D shape identity according to the 3D shape displacement. To achieve this goal, we propose a novel convolutional encoder-decoder network, the Disentangling Deep Network (DisDN). To demonstrate the effectiveness of the proposed approach, we conduct comprehensive experiments on the challenging PASCAL VOC benchmark and use different 3D shape ground truth in training and testing to avoid overfitting. The experimental results show that DisDN outperforms state-of-the-art and baseline methods.
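The abstract describes reconstruction as deforming a category-level shape identity by an instance-level shape displacement predicted from a single image. Below is a minimal, hypothetical PyTorch sketch of that disentangling idea only; the backbone, layer sizes, point-cloud resolution, and all names (DisentanglingSketch, identity_head, displacement_head) are illustrative assumptions, not the authors' DisDN architecture or training losses.

```python
# Minimal sketch of a disentangling encoder-decoder: one head yields a
# category-specific "shape identity" code, another an instance-specific
# "shape displacement" code; the output point cloud is the identity shape
# plus per-point displacements. Purely illustrative, not the paper's model.
import torch
import torch.nn as nn


class DisentanglingSketch(nn.Module):
    def __init__(self, num_points=1024, id_dim=128, disp_dim=128):
        super().__init__()
        self.num_points = num_points
        # Shared convolutional image encoder (placeholder backbone).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Two heads disentangle the category-level identity code
        # and the instance-level displacement code.
        self.identity_head = nn.Linear(128, id_dim)
        self.displacement_head = nn.Linear(128, disp_dim)
        # Decoders map each code to a point cloud / per-point offsets.
        self.identity_decoder = nn.Linear(id_dim, num_points * 3)
        self.displacement_decoder = nn.Linear(disp_dim, num_points * 3)

    def forward(self, images):
        feat = self.backbone(images)                    # (B, 128)
        id_code = self.identity_head(feat)              # category-level code
        disp_code = self.displacement_head(feat)        # instance-level code
        base_shape = self.identity_decoder(id_code)     # category shape identity
        offsets = self.displacement_decoder(disp_code)  # instance deformation
        points = (base_shape + offsets).view(-1, self.num_points, 3)
        return points, id_code, disp_code


if __name__ == "__main__":
    model = DisentanglingSketch()
    dummy = torch.randn(2, 3, 128, 128)   # two RGB images
    cloud, _, _ = model(dummy)
    print(cloud.shape)                    # torch.Size([2, 1024, 3])
```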

Original language: English
Title of host publication: Pattern Recognition and Computer Vision - 4th Chinese Conference, PRCV 2021, Proceedings
Editors: Huimin Ma, Liang Wang, Changshui Zhang, Fei Wu, Tieniu Tan, Yaonan Wang, Jianhuang Lai, Yao Zhao
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 153-166
Number of pages: 14
ISBN (Print): 9783030880064
DOIs
State: Published - 2021
Event: 4th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2021 - Beijing, China
Duration: 29 Oct 2021 - 1 Nov 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13020 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 4th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2021
Country/Territory: China
City: Beijing
Period: 29/10/21 - 1/11/21

Keywords

  • 3D shape reconstruction
  • Disentangling deep network
  • Point cloud
