TY - GEN
T1 - PoseFlow
T2 - 31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
AU - Zhang, Dingwen
AU - Guo, Guangyu
AU - Huang, Dong
AU - Han, Junwei
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/12/14
Y1 - 2018/12/14
N2 - Motion of the human body is the critical cue for understanding and characterizing human behavior in videos. Most existing approaches explore the motion cue using optical flows. However, optical flow usually contains motion on both the interested human bodies and the undesired background. This 'noisy' motion representation makes it very challenging for pose estimation and action recognition in real scenarios. To address this issue, this paper presents a novel deep motion representation, called PoseFlow, which reveals human motion in videos while suppressing background and motion blur, and being robust to occlusion. For learning PoseFlow with mild computational cost, we propose a functionally structured spatial-temporal deep network, PoseFlow Net (PFN), to jointly solve the skeleton localization and matching problems of PoseFlow. Comprehensive experiments show that PFN outperforms the state-of-the-art deep flow estimation models in generating PoseFlow. Moreover, PoseFlow demonstrates its potential on improving two challenging tasks in human video analysis: Pose estimation and action recognition.
AB - Motion of the human body is the critical cue for understanding and characterizing human behavior in videos. Most existing approaches explore the motion cue using optical flows. However, optical flow usually contains motion on both the interested human bodies and the undesired background. This 'noisy' motion representation makes it very challenging for pose estimation and action recognition in real scenarios. To address this issue, this paper presents a novel deep motion representation, called PoseFlow, which reveals human motion in videos while suppressing background and motion blur, and being robust to occlusion. For learning PoseFlow with mild computational cost, we propose a functionally structured spatial-temporal deep network, PoseFlow Net (PFN), to jointly solve the skeleton localization and matching problems of PoseFlow. Comprehensive experiments show that PFN outperforms the state-of-the-art deep flow estimation models in generating PoseFlow. Moreover, PoseFlow demonstrates its potential on improving two challenging tasks in human video analysis: Pose estimation and action recognition.
UR - http://www.scopus.com/inward/record.url?scp=85052978702&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2018.00707
DO - 10.1109/CVPR.2018.00707
M3 - 会议稿件
AN - SCOPUS:85052978702
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 6762
EP - 6770
BT - Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
PB - IEEE Computer Society
Y2 - 18 June 2018 through 22 June 2018
ER -