Learning shape-motion representations from geometric algebra spatio-temporal model for skeleton-based action recognition

Yanshan Li; Rongjie Xia; Xing Liu; Qinghua Huang

doi:10.1109/ICME.2019.00187

School of Artificial Intelligence, OPtics and Electronics

Shenzhen University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

59 Scopus citations

Abstract

Skeleton-based action recognition has been widely applied in intelligent video surveillance and human behavior analysis. Previous works have successfully applied Convolutional Neural Networks (CNN) to learn spatio-temporal characteristics of the skeleton sequence. However, they merely focus on the coordinates of isolated joints, which ignore the spatial relationships between joints and only implicitly learn the motion representations. To solve these problems, we propose an effective method to learn comprehensive representations from skeleton sequences by using Geometric Algebra. Firstly, a frontal orientation based spatio-temporal model is constructed to represent the spatial configuration and temporal dynamics of skeleton sequences, which owns the robustness against view variations. Then the shape-motion representations which mutually compensate are learned to describe skeleton actions comprehensively. Finally, a multi-stream CNN model is applied to extract and fuse deep features from the complementary shape-motion representations. Experimental results on NTU RGB+D and Northwestern-UCLA datasets consistently verify the superiority of our method.
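As a rough illustration of what complementary shape and motion descriptors could look like, the sketch below uses the geometric-algebra outer (wedge) product of joint vectors within one frame for shape, and joint displacements between consecutive frames for motion. This is an assumption made for illustration only: the abstract does not give the authors' formulas, and the function names, the choice of reference joint, and the pairing of neighbouring joints are all hypothetical. The frontal-orientation model and the multi-stream CNN are not covered.

```python
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]


def sub(a: Vec, b: Vec) -> Vec:
    """Component-wise difference a - b."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def wedge(a: Vec, b: Vec) -> Vec:
    """Outer product a ^ b of two 3D vectors.

    Returns the bivector components on the basis (e1^e2, e1^e3, e2^e3).
    The result is antisymmetric (a ^ b = -(b ^ a)) and zero when a and b
    are parallel.
    """
    return (
        a[0] * b[1] - a[1] * b[0],
        a[0] * b[2] - a[2] * b[0],
        a[1] * b[2] - a[2] * b[1],
    )


def shape_features(frame: Sequence[Vec], ref: int = 0) -> List[Vec]:
    """Hypothetical shape descriptor for a single frame.

    Each joint is expressed relative to a reference joint (index `ref`,
    an assumption), then consecutive joint vectors are wedged together so
    that each feature encodes a relationship between two joints.
    """
    rel = [sub(joint, frame[ref]) for joint in frame]
    return [wedge(rel[i], rel[i + 1]) for i in range(len(rel) - 1)]


def motion_features(seq: Sequence[Sequence[Vec]]) -> List[List[Vec]]:
    """Hypothetical motion descriptor: per-joint displacement between frames."""
    return [
        [sub(curr, prev) for prev, curr in zip(f0, f1)]
        for f0, f1 in zip(seq, seq[1:])
    ]


# Toy sequence: three joints, two frames; the second frame is the first
# one translated by one unit along z.
frame0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
frame1 = [(x, y, z + 1.0) for (x, y, z) in frame0]
sequence = [frame0, frame1]

shape0 = shape_features(frame0)
shape1 = shape_features(frame1)
motion = motion_features(sequence)
```

In this toy case the shape features of both frames are identical, because a pure translation cancels once joints are taken relative to the reference joint, while the motion features record the unit shift along z. The two descriptors therefore carry different information.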

Original language	English
Title of host publication	Proceedings - 2019 IEEE International Conference on Multimedia and Expo, ICME 2019
Publisher	IEEE Computer Society
Pages	1066-1071
Number of pages	6
ISBN (Electronic)	9781538695524
DOIs	https://doi.org/10.1109/ICME.2019.00187
State	Published - Jul 2019
Event	2019 IEEE International Conference on Multimedia and Expo, ICME 2019 - Shanghai, China, Duration: 8 Jul 2019 → 12 Jul 2019

Publication series

Name	Proceedings - IEEE International Conference on Multimedia and Expo
Volume	2019-July
ISSN (Print)	1945-7871
ISSN (Electronic)	1945-788X

Conference

Conference	2019 IEEE International Conference on Multimedia and Expo, ICME 2019
Country/Territory	China
City	Shanghai
Period	8/07/19 → 12/07/19

Keywords

Geometric algebra
Human action recognition
Skeleton sequence
Spatio-temporal model

Access to Document

10.1109/ICME.2019.00187

Cite this

Li, Y., Xia, R., Liu, X., & Huang, Q. (2019). Learning shape-motion representations from geometric algebra spatio-temporal model for skeleton-based action recognition. In Proceedings - 2019 IEEE International Conference on Multimedia and Expo, ICME 2019 (pp. 1066-1071). Article 8785009 (Proceedings - IEEE International Conference on Multimedia and Expo; Vol. 2019-July). IEEE Computer Society. https://doi.org/10.1109/ICME.2019.00187

@inproceedings{7c4aec5cb0f44b34a61cf122728e38e6,

title = "Learning shape-motion representations from geometric algebra spatio-temporal model for skeleton-based action recognition",

abstract = "Skeleton-based action recognition has been widely applied in intelligent video surveillance and human behavior analysis. Previous works have successfully applied Convolutional Neural Networks (CNN) to learn spatio-temporal characteristics of the skeleton sequence. However, they merely focus on the coordinates of isolated joints, which ignore the spatial relationships between joints and only implicitly learn the motion representations. To solve these problems, we propose an effective method to learn comprehensive representations from skeleton sequences by using Geometric Algebra. Firstly, a frontal orientation based spatio-temporal model is constructed to represent the spatial configuration and temporal dynamics of skeleton sequences, which owns the robustness against view variations. Then the shape-motion representations which mutually compensate are learned to describe skeleton actions comprehensively. Finally, a multi-stream CNN model is applied to extract and fuse deep features from the complementary shape-motion representations. Experimental results on NTU RGB+D and Northwestern-UCLA datasets consistently verify the superiority of our method.",

keywords = "Geometric algebra, Human action recognition, Skeleton sequence, Spatio-temporal model",

author = "Yanshan Li and Rongjie Xia and Xing Liu and Qinghua Huang",

note = "Publisher Copyright: {\textcopyright} 2019 IEEE.; 2019 IEEE International Conference on Multimedia and Expo, ICME 2019 ; Conference date: 08-07-2019 Through 12-07-2019",

year = "2019",

month = jul,

doi = "10.1109/ICME.2019.00187",

language = "English",

series = "Proceedings - IEEE International Conference on Multimedia and Expo",

publisher = "IEEE Computer Society",

pages = "1066--1071",

booktitle = "Proceedings - 2019 IEEE International Conference on Multimedia and Expo, ICME 2019",

}

Li, Y, Xia, R, Liu, X & Huang, Q 2019, Learning shape-motion representations from geometric algebra spatio-temporal model for skeleton-based action recognition. in Proceedings - 2019 IEEE International Conference on Multimedia and Expo, ICME 2019., 8785009, Proceedings - IEEE International Conference on Multimedia and Expo, vol. 2019-July, IEEE Computer Society, pp. 1066-1071, 2019 IEEE International Conference on Multimedia and Expo, ICME 2019, Shanghai, China, 8/07/19. https://doi.org/10.1109/ICME.2019.00187

Learning shape-motion representations from geometric algebra spatio-temporal model for skeleton-based action recognition. / Li, Yanshan; Xia, Rongjie; Liu, Xing et al.
Proceedings - 2019 IEEE International Conference on Multimedia and Expo, ICME 2019. IEEE Computer Society, 2019. p. 1066-1071 8785009 (Proceedings - IEEE International Conference on Multimedia and Expo; Vol. 2019-July).

TY - GEN

T1 - Learning shape-motion representations from geometric algebra spatio-temporal model for skeleton-based action recognition

AU - Li, Yanshan

AU - Xia, Rongjie

AU - Liu, Xing

AU - Huang, Qinghua

PY - 2019/7

Y1 - 2019/7

N2 - Skeleton-based action recognition has been widely applied in intelligent video surveillance and human behavior analysis. Previous works have successfully applied Convolutional Neural Networks (CNN) to learn spatio-temporal characteristics of the skeleton sequence. However, they merely focus on the coordinates of isolated joints, which ignore the spatial relationships between joints and only implicitly learn the motion representations. To solve these problems, we propose an effective method to learn comprehensive representations from skeleton sequences by using Geometric Algebra. Firstly, a frontal orientation based spatio-temporal model is constructed to represent the spatial configuration and temporal dynamics of skeleton sequences, which owns the robustness against view variations. Then the shape-motion representations which mutually compensate are learned to describe skeleton actions comprehensively. Finally, a multi-stream CNN model is applied to extract and fuse deep features from the complementary shape-motion representations. Experimental results on NTU RGB+D and Northwestern-UCLA datasets consistently verify the superiority of our method.

AB - Skeleton-based action recognition has been widely applied in intelligent video surveillance and human behavior analysis. Previous works have successfully applied Convolutional Neural Networks (CNN) to learn spatio-temporal characteristics of the skeleton sequence. However, they merely focus on the coordinates of isolated joints, which ignore the spatial relationships between joints and only implicitly learn the motion representations. To solve these problems, we propose an effective method to learn comprehensive representations from skeleton sequences by using Geometric Algebra. Firstly, a frontal orientation based spatio-temporal model is constructed to represent the spatial configuration and temporal dynamics of skeleton sequences, which owns the robustness against view variations. Then the shape-motion representations which mutually compensate are learned to describe skeleton actions comprehensively. Finally, a multi-stream CNN model is applied to extract and fuse deep features from the complementary shape-motion representations. Experimental results on NTU RGB+D and Northwestern-UCLA datasets consistently verify the superiority of our method.

KW - Geometric algebra

KW - Human action recognition

KW - Skeleton sequence

KW - Spatio-temporal model

UR - http://www.scopus.com/inward/record.url?scp=85071022633&partnerID=8YFLogxK

U2 - 10.1109/ICME.2019.00187

DO - 10.1109/ICME.2019.00187

M3 - Conference contribution

AN - SCOPUS:85071022633

T3 - Proceedings - IEEE International Conference on Multimedia and Expo

SP - 1066

EP - 1071

BT - Proceedings - 2019 IEEE International Conference on Multimedia and Expo, ICME 2019

PB - IEEE Computer Society

T2 - 2019 IEEE International Conference on Multimedia and Expo, ICME 2019

Y2 - 8 July 2019 through 12 July 2019

ER -

Li Y, Xia R, Liu X, Huang Q. Learning shape-motion representations from geometric algebra spatio-temporal model for skeleton-based action recognition. In Proceedings - 2019 IEEE International Conference on Multimedia and Expo, ICME 2019. IEEE Computer Society. 2019. p. 1066-1071. 8785009. (Proceedings - IEEE International Conference on Multimedia and Expo). doi: 10.1109/ICME.2019.00187
