Mix Dimension in Poincaré Geometry for 3D Skeleton-based Action Recognition

Wei Peng; Jingang Shi; Zhaoqiang Xia; Guoying Zhao

doi:10.1145/3394171.3413910

School of Electronics and Information

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

58 Scopus citations

Abstract

Graph Convolutional Networks (GCNs) have already demonstrated a powerful ability to model irregular data, e.g., skeletal data in human action recognition, providing an exciting new way to fuse rich structural information for nodes residing in different parts of a graph. In human action recognition, current works introduce a dynamic graph generation mechanism to better capture the underlying semantic skeleton connections and thus improve performance. In this paper, we provide an orthogonal way to explore the underlying connections. Instead of introducing an expensive dynamic graph generation paradigm, we build a more efficient GCN on a Riemann manifold, which we think is a more suitable space for modeling graph data, to make the extracted representations fit the embedding matrix. Specifically, we present a novel spatial-temporal GCN (ST-GCN) architecture defined via Poincaré geometry, so that it can better model the latent anatomy of the structural data. To further explore the optimal projection dimension in the Riemann space, we mix different dimensions on the manifold and provide an efficient way to explore the dimension for each ST-GCN layer. We evaluate the resulting architecture on the two largest-scale 3D datasets to date, i.e., NTU RGB+D and NTU RGB+D 120. The comparison results show that the model achieves superior performance under every evaluation metric with only 40% of the model size of the previous best GCN method, which demonstrates the effectiveness of our model.

Original language	English
Title of host publication	MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia
Publisher	Association for Computing Machinery, Inc
Pages	1432-1440
Number of pages	9
ISBN (Electronic)	9781450379885
DOIs	https://doi.org/10.1145/3394171.3413910
State	Published - 12 Oct 2020
Event	28th ACM International Conference on Multimedia, MM 2020 - Virtual, Online, United States Duration: 12 Oct 2020 → 16 Oct 2020

Publication series

Name	MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia

Conference

Conference	28th ACM International Conference on Multimedia, MM 2020
Country/Territory	United States
City	Virtual, Online
Period	12/10/20 → 16/10/20

Keywords

graph convolutional networks
graph topology analysis
riemann manifold
skeleton-based action recognition

Access to Document

10.1145/3394171.3413910

Cite this

Peng, W., Shi, J., Xia, Z., & Zhao, G. (2020). Mix Dimension in Poincaré Geometry for 3D Skeleton-based Action Recognition. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 1432-1440). (MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia). Association for Computing Machinery, Inc. https://doi.org/10.1145/3394171.3413910

@inproceedings{010952e7998748948b5e9d519512c914,

title = "Mix Dimension in Poincar{\'e} Geometry for 3D Skeleton-based Action Recognition",

abstract = "Graph Convolutional Networks (GCNs) have already demonstrated a powerful ability to model irregular data, e.g., skeletal data in human action recognition, providing an exciting new way to fuse rich structural information for nodes residing in different parts of a graph. In human action recognition, current works introduce a dynamic graph generation mechanism to better capture the underlying semantic skeleton connections and thus improve performance. In this paper, we provide an orthogonal way to explore the underlying connections. Instead of introducing an expensive dynamic graph generation paradigm, we build a more efficient GCN on a Riemann manifold, which we think is a more suitable space for modeling graph data, to make the extracted representations fit the embedding matrix. Specifically, we present a novel spatial-temporal GCN (ST-GCN) architecture defined via Poincar{\'e} geometry, so that it can better model the latent anatomy of the structural data. To further explore the optimal projection dimension in the Riemann space, we mix different dimensions on the manifold and provide an efficient way to explore the dimension for each ST-GCN layer. We evaluate the resulting architecture on the two largest-scale 3D datasets to date, i.e., NTU RGB+D and NTU RGB+D 120. The comparison results show that the model achieves superior performance under every evaluation metric with only 40% of the model size of the previous best GCN method, which demonstrates the effectiveness of our model.",

keywords = "graph convolutional networks, graph topology analysis, riemann manifold, skeleton-based action recognition",

author = "Wei Peng and Jingang Shi and Zhaoqiang Xia and Guoying Zhao",

note = "Publisher Copyright: {\textcopyright} 2020 ACM.; 28th ACM International Conference on Multimedia, MM 2020 ; Conference date: 12-10-2020 Through 16-10-2020",

year = "2020",

month = oct,

day = "12",

doi = "10.1145/3394171.3413910",

language = "English",

series = "MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia",

publisher = "Association for Computing Machinery, Inc",

pages = "1432--1440",

booktitle = "MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia",

}

Peng, W, Shi, J, Xia, Z & Zhao, G 2020, Mix Dimension in Poincaré Geometry for 3D Skeleton-based Action Recognition. in MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia. MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia, Association for Computing Machinery, Inc, pp. 1432-1440, 28th ACM International Conference on Multimedia, MM 2020, Virtual, Online, United States, 12/10/20. https://doi.org/10.1145/3394171.3413910

Mix Dimension in Poincaré Geometry for 3D Skeleton-based Action Recognition. / Peng, Wei; Shi, Jingang; Xia, Zhaoqiang et al.
MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia. Association for Computing Machinery, Inc, 2020. p. 1432-1440 (MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia).

TY - GEN

T1 - Mix Dimension in Poincaré Geometry for 3D Skeleton-based Action Recognition

AU - Peng, Wei

AU - Shi, Jingang

AU - Xia, Zhaoqiang

AU - Zhao, Guoying

PY - 2020/10/12

Y1 - 2020/10/12

AB - Graph Convolutional Networks (GCNs) have already demonstrated a powerful ability to model irregular data, e.g., skeletal data in human action recognition, providing an exciting new way to fuse rich structural information for nodes residing in different parts of a graph. In human action recognition, current works introduce a dynamic graph generation mechanism to better capture the underlying semantic skeleton connections and thus improve performance. In this paper, we provide an orthogonal way to explore the underlying connections. Instead of introducing an expensive dynamic graph generation paradigm, we build a more efficient GCN on a Riemann manifold, which we think is a more suitable space for modeling graph data, to make the extracted representations fit the embedding matrix. Specifically, we present a novel spatial-temporal GCN (ST-GCN) architecture defined via Poincaré geometry, so that it can better model the latent anatomy of the structural data. To further explore the optimal projection dimension in the Riemann space, we mix different dimensions on the manifold and provide an efficient way to explore the dimension for each ST-GCN layer. We evaluate the resulting architecture on the two largest-scale 3D datasets to date, i.e., NTU RGB+D and NTU RGB+D 120. The comparison results show that the model achieves superior performance under every evaluation metric with only 40% of the model size of the previous best GCN method, which demonstrates the effectiveness of our model.

KW - graph convolutional networks

KW - graph topology analysis

KW - riemann manifold

KW - skeleton-based action recognition

UR - http://www.scopus.com/inward/record.url?scp=85106922690&partnerID=8YFLogxK

U2 - 10.1145/3394171.3413910

DO - 10.1145/3394171.3413910

M3 - Conference contribution

AN - SCOPUS:85106922690

T3 - MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia

SP - 1432

EP - 1440

BT - MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia

PB - Association for Computing Machinery, Inc

T2 - 28th ACM International Conference on Multimedia, MM 2020

Y2 - 12 October 2020 through 16 October 2020

ER -
