Skip to main navigation Skip to search Skip to main content

Beyond Vision: A Semantic Reasoning Enhanced Model for Gesture Recognition with Improved Spatiotemporal Capacity

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

3 Scopus citations

Abstract

Gesture recognition is an imperative and practical problem owing to its great application potential. Although recent works have made great progress in this field, there also exist three non-negligible problems: 1) existing works lack efficient temporal modeling ability; 2) existing works lack effective spatial attention capacity; 3) most works only focus on the visual information, without considering the semantic relationship between different classes. To tackle the first problem, we propose a Long and Short-term Temporal Shift Module (LS-TSM). It extends the original TSM and expands the step size of shift operation to model long-term and short-term temporal information simultaneously. For the second problem, we expect to focus on the spatial area where the change of hand mainly occurs. Therefore, we propose a Spatial Attention Module (SAM) which utilizes the RGB difference between frames to get a spatial attention mask to assign different weights to different spatial positions. As for the last, we propose a Label Relation Module (LRM) which can take full advantage of the relationship among classes based on their labels’ semantic information. With the proposed modules, our work achieves the state-of-the-art performance on two commonly used gesture datasets, i.e., EgoGesture and NVGesture datasets. Extensive experiments demonstrate the effectiveness of our proposed modules.

Original languageEnglish
Title of host publicationPattern Recognition and Computer Vision - 5th Chinese Conference, PRCV 2022, Proceedings
EditorsShiqi Yu, Jianguo Zhang, Zhaoxiang Zhang, Tieniu Tan, Pong C. Yuen, Yike Guo, Junwei Han, Jianhuang Lai
PublisherSpringer Science and Business Media Deutschland GmbH
Pages420-434
Number of pages15
ISBN (Print)9783031189128
DOIs
StatePublished - 2022
Event5th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2022 - Shenzhen, China
Duration: 4 Nov 20227 Nov 2022

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume13536 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference5th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2022
Country/TerritoryChina
CityShenzhen
Period4/11/227/11/22

Keywords

  • Gesture recognition
  • Semantic relation
  • Spatial attention
  • Temporal modeling

Fingerprint

Dive into the research topics of 'Beyond Vision: A Semantic Reasoning Enhanced Model for Gesture Recognition with Improved Spatiotemporal Capacity'. Together they form a unique fingerprint.

Cite this