Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep CNN

Bo Li; Yuchao Dai; Xuelian Cheng; Huahui Chen; Yi Lin; Mingyi He

doi:10.1109/ICMEW.2017.8026282

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

252 Scopus citations

Abstract

We present an image-classification-based approach to large-scale action recognition from 3D skeleton videos. First, we map the 3D skeleton videos to color images, such that the transformed action images are translation- and scale-invariant and dataset independent. Second, we propose a multi-scale deep convolutional neural network (CNN) for the image classification task, which could enhance the model's ability to adjust to temporal frequency. Even though the action images are very different from natural images, the fine-tuning strategy still works well. Finally, we exploit various data augmentation methods to improve the generalization ability of the network. Experimental results on NTU RGB-D, the largest and most challenging benchmark dataset, show that our method achieves state-of-the-art performance and outperforms other methods by a large margin.
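
The abstract does not spell out the skeleton-to-image mapping, so the following is only a minimal sketch of one way such a mapping could work, not the authors' implementation. It assumes that joints become image rows, frames become columns, the (x, y, z) coordinates become the (R, G, B) channels, and that a single minimum and maximum taken over the whole sequence are used for normalization. The function name `skeleton_to_image` and the toy data are hypothetical.

```python
from typing import List, Tuple

Joint = Tuple[float, float, float]   # (x, y, z) position of one joint
Frame = List[Joint]                  # all joints of one video frame


def skeleton_to_image(frames: List[Frame]) -> List[List[Tuple[int, int, int]]]:
    """Map a skeleton sequence to an RGB image (rows = joints, columns = frames).

    Assumption: one shared min and max over every coordinate of the sequence,
    so shifting or uniformly scaling the skeleton leaves the image unchanged.
    """
    values = [c for frame in frames for joint in frame for c in joint]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # guard against a constant sequence

    image = []
    for j in range(len(frames[0])):
        row = []
        for frame in frames:
            # Rescale each coordinate to the 0..255 range of one color channel.
            row.append(tuple(round(255 * (c - lo) / span) for c in frame[j]))
        image.append(row)
    return image


# Toy sequence: 3 frames, 2 joints each.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.5, 0.25)],
    [(0.5, 0.5, 0.5), (1.0, 1.0, 0.0)],
    [(0.25, 0.75, 0.5), (0.0, 1.0, 1.0)],
]
# The same sequence, scaled by 2 and translated by 8 along every axis.
moved = [[(2.0 * x + 8.0, 2.0 * y + 8.0, 2.0 * z + 8.0) for (x, y, z) in f]
         for f in frames]

print(skeleton_to_image(frames) == skeleton_to_image(moved))
```

Because the normalization depends only on the sequence's own extremes, no dataset-wide statistics are needed, which is one plausible reading of "dataset independent" in the abstract.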

Original language	English
Title of host publication	2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	601-604
Number of pages	4
ISBN (Electronic)	9781538605608
DOIs	https://doi.org/10.1109/ICMEW.2017.8026282
State	Published - 5 Sep 2017
Externally published	Yes
Event	2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017, Hong Kong, Hong Kong (10 Jul 2017 → 14 Jul 2017)

Publication series

Name	2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017

Conference

Conference	2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017
Country/Territory	Hong Kong
City	Hong Kong
Period	10/07/17 → 14/07/17

Keywords

3D skeleton
CNN
action recognition
image mapping

Access to Document

10.1109/ICMEW.2017.8026282

Cite this

Li, B., Dai, Y., Cheng, X., Chen, H., Lin, Y., & He, M. (2017). Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep CNN. In 2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017 (pp. 601-604). Article 8026282 (2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICMEW.2017.8026282

Li, Bo ; Dai, Yuchao ; Cheng, Xuelian et al. / Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep CNN. 2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017. Institute of Electrical and Electronics Engineers Inc., 2017. pp. 601-604 (2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017).

@inproceedings{42d3372487ba43cda832d7ad179d2922,
title = "Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep CNN",
abstract = "We present an image-classification-based approach to large-scale action recognition from 3D skeleton videos. First, we map the 3D skeleton videos to color images, such that the transformed action images are translation- and scale-invariant and dataset independent. Second, we propose a multi-scale deep convolutional neural network (CNN) for the image classification task, which could enhance the model's ability to adjust to temporal frequency. Even though the action images are very different from natural images, the fine-tuning strategy still works well. Finally, we exploit various data augmentation methods to improve the generalization ability of the network. Experimental results on NTU RGB-D, the largest and most challenging benchmark dataset, show that our method achieves state-of-the-art performance and outperforms other methods by a large margin.",
keywords = "3D skeleton, CNN, action recognition, image mapping",
author = "Bo Li and Yuchao Dai and Xuelian Cheng and Huahui Chen and Yi Lin and Mingyi He",
note = "Publisher Copyright: {\textcopyright} 2017 IEEE.; 2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017 ; Conference date: 10-07-2017 Through 14-07-2017",
year = "2017",
month = sep,
day = "5",
doi = "10.1109/ICMEW.2017.8026282",
language = "English",
series = "2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "601--604",
booktitle = "2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017",
}

Li, B, Dai, Y, Cheng, X, Chen, H, Lin, Y & He, M 2017, Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep CNN. in 2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017., 8026282, 2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017, Institute of Electrical and Electronics Engineers Inc., pp. 601-604, 2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017, Hong Kong, Hong Kong, 10/07/17. https://doi.org/10.1109/ICMEW.2017.8026282

Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep CNN. / Li, Bo; Dai, Yuchao; Cheng, Xuelian et al.
2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017. Institute of Electrical and Electronics Engineers Inc., 2017. p. 601-604 8026282 (2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017).

TY - GEN
T1 - Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep CNN
AU - Li, Bo
AU - Dai, Yuchao
AU - Cheng, Xuelian
AU - Chen, Huahui
AU - Lin, Yi
AU - He, Mingyi
PY - 2017/9/5
Y1 - 2017/9/5
N2 - We present an image-classification-based approach to large-scale action recognition from 3D skeleton videos. First, we map the 3D skeleton videos to color images, such that the transformed action images are translation- and scale-invariant and dataset independent. Second, we propose a multi-scale deep convolutional neural network (CNN) for the image classification task, which could enhance the model's ability to adjust to temporal frequency. Even though the action images are very different from natural images, the fine-tuning strategy still works well. Finally, we exploit various data augmentation methods to improve the generalization ability of the network. Experimental results on NTU RGB-D, the largest and most challenging benchmark dataset, show that our method achieves state-of-the-art performance and outperforms other methods by a large margin.
AB - We present an image-classification-based approach to large-scale action recognition from 3D skeleton videos. First, we map the 3D skeleton videos to color images, such that the transformed action images are translation- and scale-invariant and dataset independent. Second, we propose a multi-scale deep convolutional neural network (CNN) for the image classification task, which could enhance the model's ability to adjust to temporal frequency. Even though the action images are very different from natural images, the fine-tuning strategy still works well. Finally, we exploit various data augmentation methods to improve the generalization ability of the network. Experimental results on NTU RGB-D, the largest and most challenging benchmark dataset, show that our method achieves state-of-the-art performance and outperforms other methods by a large margin.
KW - 3D skeleton
KW - CNN
KW - action recognition
KW - image mapping
UR - http://www.scopus.com/inward/record.url?scp=85031694994&partnerID=8YFLogxK
U2 - 10.1109/ICMEW.2017.8026282
DO - 10.1109/ICMEW.2017.8026282
M3 - Conference contribution
AN - SCOPUS:85031694994
T3 - 2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017
SP - 601
EP - 604
BT - 2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017
Y2 - 10 July 2017 through 14 July 2017
ER -

Li B, Dai Y, Cheng X, Chen H, Lin Y, He M. Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep CNN. In 2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017. Institute of Electrical and Electronics Engineers Inc. 2017. p. 601-604. 8026282. (2017 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2017). doi: 10.1109/ICMEW.2017.8026282
