TY - GEN
T1 - Disentangling Deep Network for Reconstructing 3D Object Shapes from Single 2D Images
AU - Yang, Yang
AU - Han, Junwei
AU - Zhang, Dingwen
AU - Cheng, De
N1 - Publisher Copyright: © 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - Recovering 3D shapes of deformable objects from single 2D images is an extremely challenging and ill-posed problem. Most existing approaches are based on structure-from-motion or graph inference, where a 3D shape is recovered by fitting 2D keypoints or masks instead of directly using the vital cues in the original 2D image. These methods usually require multiple views of an object instance and rely on accurate labeling, detection, and matching of 2D keypoints or masks across multiple images. To overcome these limitations, we aim to reconstruct 3D deformable object shapes directly from the given unconstrained 2D images. In training, instead of using multiple images per object instance, our approach relaxes the constraint and uses images from the same object category (with one 2D image per object instance). The key is to disentangle the category-specific representation of the 3D shape identity and the instance-specific representation of the 3D shape displacement from the 2D training images. In testing, the 3D shape of an object can be reconstructed from the given image by deforming the 3D shape identity according to the 3D shape displacement. To achieve this goal, we propose a novel convolutional encoder-decoder network, the Disentangling Deep Network (DisDN). To demonstrate the effectiveness of the proposed approach, we conduct comprehensive experiments on the challenging PASCAL VOC benchmark and use different 3D shape ground truth in training and testing to avoid overfitting. The experimental results show that DisDN outperforms other state-of-the-art and baseline methods.
AB - Recovering 3D shapes of deformable objects from single 2D images is an extremely challenging and ill-posed problem. Most existing approaches are based on structure-from-motion or graph inference, where a 3D shape is recovered by fitting 2D keypoints or masks instead of directly using the vital cues in the original 2D image. These methods usually require multiple views of an object instance and rely on accurate labeling, detection, and matching of 2D keypoints or masks across multiple images. To overcome these limitations, we aim to reconstruct 3D deformable object shapes directly from the given unconstrained 2D images. In training, instead of using multiple images per object instance, our approach relaxes the constraint and uses images from the same object category (with one 2D image per object instance). The key is to disentangle the category-specific representation of the 3D shape identity and the instance-specific representation of the 3D shape displacement from the 2D training images. In testing, the 3D shape of an object can be reconstructed from the given image by deforming the 3D shape identity according to the 3D shape displacement. To achieve this goal, we propose a novel convolutional encoder-decoder network, the Disentangling Deep Network (DisDN). To demonstrate the effectiveness of the proposed approach, we conduct comprehensive experiments on the challenging PASCAL VOC benchmark and use different 3D shape ground truth in training and testing to avoid overfitting. The experimental results show that DisDN outperforms other state-of-the-art and baseline methods.
KW - 3D shape reconstruction
KW - Disentangling deep network
KW - Point cloud
UR - http://www.scopus.com/inward/record.url?scp=85118222936&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-88007-1_13
DO - 10.1007/978-3-030-88007-1_13
M3 - Conference contribution
AN - SCOPUS:85118222936
SN - 9783030880064
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 153
EP - 166
BT - Pattern Recognition and Computer Vision - 4th Chinese Conference, PRCV 2021, Proceedings
A2 - Ma, Huimin
A2 - Wang, Liang
A2 - Zhang, Changshui
A2 - Wu, Fei
A2 - Tan, Tieniu
A2 - Wang, Yaonan
A2 - Lai, Jianhuang
A2 - Zhao, Yao
PB - Springer Science and Business Media Deutschland GmbH
T2 - 4th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2021
Y2 - 29 October 2021 through 1 November 2021
ER -