HandFormer: Hand pose reconstructing from a single RGB image

Zixun Jiao, Xihan Wang, Jingcao Li, Rongxin Gao, Miao He, Jiao Liang, Zhaoqiang Xia, Quanli Gao

Research output: Contribution to journal › Article › peer-review


Abstract

We propose a multi-task progressive Transformer framework that reconstructs hand poses from a single RGB image, addressing challenges such as hand occlusion, hand distraction, and hand shape bias. The framework comprises three key components: a feature extraction branch, a palm segmentation branch, and a parameter prediction branch. The feature extraction branch first employs the progressive Transformer to extract multi-scale features from the input image. These multi-scale features are then fed into a multi-layer perceptron (MLP) to obtain palm alignment features. An efficient fusion module integrates the palm alignment features with the backbone features, further enhancing the features used for parameter prediction. A dense hand model is generated from a pre-computed articulated, deformable mesh hand model. We evaluate our method on the STEREO, FreiHAND, and HO3D datasets. The experimental results show that our approach achieves 3D mean errors of 10.92 mm, 12.33 mm, and 9.6 mm on the respective datasets.
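The abstract describes a three-branch pipeline: a progressive Transformer backbone producing multi-scale features, an MLP yielding palm alignment features, a palm segmentation branch, and a fusion module feeding a parameter prediction head. The paper does not release code; the sketch below is a minimal, illustrative PyTorch wiring of such a pipeline. All module names (e.g. `ProgressiveTransformerBackbone`, `HandFormerSketch`), feature dimensions, the concatenation-based fusion, and the assumed parameter count are our own assumptions, not the authors' implementation.

```python
# Minimal sketch of a HandFormer-style three-branch pipeline (assumed design).
import torch
import torch.nn as nn

class ProgressiveTransformerBackbone(nn.Module):
    """Stand-in for the progressive Transformer feature extractor."""
    def __init__(self, dim=256):
        super().__init__()
        self.stem = nn.Conv2d(3, dim, kernel_size=16, stride=16)  # patchify
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, img):                    # img: (B, 3, 224, 224)
        x = self.stem(img)                     # (B, dim, 14, 14)
        tokens = x.flatten(2).transpose(1, 2)  # (B, 196, dim)
        return self.encoder(tokens)            # backbone features

class HandFormerSketch(nn.Module):
    def __init__(self, dim=256, n_hand_params=61):  # 61 is an assumed size
        super().__init__()
        self.backbone = ProgressiveTransformerBackbone(dim)
        # Palm segmentation branch: per-token palm/background logits.
        self.palm_seg = nn.Linear(dim, 2)
        # MLP producing palm alignment features from pooled tokens.
        self.palm_mlp = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        # Fusion of palm alignment and backbone features
        # (simple concatenation + projection as a placeholder).
        self.fuse = nn.Linear(2 * dim, dim)
        # Parameter prediction branch: pose/shape parameters of an
        # articulated, deformable hand mesh model.
        self.param_head = nn.Linear(dim, n_hand_params)

    def forward(self, img):
        feats = self.backbone(img)             # (B, N, dim)
        seg_logits = self.palm_seg(feats)      # (B, N, 2) palm mask logits
        pooled = feats.mean(dim=1)             # global backbone feature
        palm_feat = self.palm_mlp(pooled)      # palm alignment feature
        fused = self.fuse(torch.cat([pooled, palm_feat], dim=-1))
        return self.param_head(fused), seg_logits

model = HandFormerSketch()
params, seg = model(torch.randn(1, 3, 224, 224))
print(params.shape, seg.shape)  # torch.Size([1, 61]) torch.Size([1, 196, 2])
```

In this sketch the segmentation logits would supervise the palm segmentation branch during multi-task training, while the predicted parameters would drive the pre-computed deformable hand mesh; both loss terms and the mesh deformation step are omitted as they are not specified in the abstract.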

Original language: English
Pages (from-to): 155-164
Number of pages: 10
Journal: Pattern Recognition Letters
Volume: 183
DOIs
State: Published - Jul 2024

Keywords

  • Hand pose estimation
  • Hand pose estimation and segmentation
  • Multi-scale features
  • Multi-task progressive Transformer framework
  • Multi-task learning
