Micro-gesture Online Recognition with Dual-stream Multi-scale Transformer in Long Videos

Yuhan Wang, Ke Rui Linghu, Hexiang Huang, Zhaoqiang Xia

Research output: Contribution to journal › Conference article › peer-review

Abstract

Micro-gestures are increasingly recognized as a key indicator in emotion analysis and have attracted growing research interest. Most research efforts have been directed towards the classification of micro-gestures, which entails predicting their categories; comparatively few studies have addressed the detection of micro-gestures. Micro-gesture online recognition (spotting), which involves predicting both the temporal position and the category, is a preliminary step for classification but has received limited attention. In this context, we construct a deep network with dual-stream input for micro-gesture online recognition. Specifically, we use a sequential action recognition model to extract motion features from RGB and skeleton sequences separately, which are then processed by a multi-scale Transformer encoder serving as the detection model. The proposed networks are trained with a two-stage strategy and combined to perform temporal spotting. Our method is validated on the SMG dataset and ranked first in the online recognition task (Track 2) of the MiGA2024 Challenge.

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 3848
State: Published - 2024
Event: 2024 IJCAI Workshop and Challenge on Micro-Gesture Analysis for Hidden Emotion Understanding, MiGA 2024 - Jeju, Korea, Republic of
Duration: 4 Aug 2024 → …

Keywords

  • Dual-stream network
  • Micro-gesture online recognition
  • Multi-scale Transformer
