TY - JOUR
T1 - An Adaptive Viewpoint Transformation Network for 3D Human Pose Estimation
AU - Liang, Guoqiang
AU - Zhong, Xiangping
AU - Ran, Lingyan
AU - Zhang, Yanning
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2020
Y1 - 2020
N2 - Human pose estimation from a monocular image has attracted considerable interest due to its wide range of potential applications. The performance of 2D human pose estimation has improved greatly with the emergence of deep convolutional neural networks. In contrast, recovering a 3D human pose from a 2D pose remains a challenging problem. Currently, most methods try to learn a universal mapping that can be applied to all human poses under any viewpoint. However, due to the large variety of human poses and camera viewpoints, it is very difficult to learn such a universal mapping from current datasets for 3D pose estimation. Instead of learning a universal mapping, we propose to learn an adaptive viewpoint transformation module, which transforms the 2D human pose to a viewpoint more suitable for recovering the 3D human pose. Specifically, our transformation module takes a 2D pose as input and predicts the transformation parameters. Rather than relying on hand-crafted criteria, this module is learned directly from the datasets and depends on the input 2D pose during the testing phase. The 3D pose is then recovered from the transformed 2D pose. Since the difficulty of 3D pose recovery is reduced, we can obtain more accurate estimation results. Experiments on the Human3.6M and MPII datasets show that the proposed adaptive viewpoint transformation can improve the performance of 3D human pose estimation.
AB - Human pose estimation from a monocular image has attracted considerable interest due to its wide range of potential applications. The performance of 2D human pose estimation has improved greatly with the emergence of deep convolutional neural networks. In contrast, recovering a 3D human pose from a 2D pose remains a challenging problem. Currently, most methods try to learn a universal mapping that can be applied to all human poses under any viewpoint. However, due to the large variety of human poses and camera viewpoints, it is very difficult to learn such a universal mapping from current datasets for 3D pose estimation. Instead of learning a universal mapping, we propose to learn an adaptive viewpoint transformation module, which transforms the 2D human pose to a viewpoint more suitable for recovering the 3D human pose. Specifically, our transformation module takes a 2D pose as input and predicts the transformation parameters. Rather than relying on hand-crafted criteria, this module is learned directly from the datasets and depends on the input 2D pose during the testing phase. The 3D pose is then recovered from the transformed 2D pose. Since the difficulty of 3D pose recovery is reduced, we can obtain more accurate estimation results. Experiments on the Human3.6M and MPII datasets show that the proposed adaptive viewpoint transformation can improve the performance of 3D human pose estimation.
KW - 3D human pose estimation
KW - adaptive viewpoint transformation
KW - deep convolutional neural network
UR - http://www.scopus.com/inward/record.url?scp=85091824454&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2020.3013917
DO - 10.1109/ACCESS.2020.3013917
M3 - Article
AN - SCOPUS:85091824454
SN - 2169-3536
VL - 8
SP - 143076
EP - 143084
JO - IEEE Access
JF - IEEE Access
M1 - 9154716
ER -