Learning motion-guided salience features for weakly supervised group activity recognition

Zexing Du, Qing Wang

Research output: Contribution to journal › Article › peer-review

Abstract

This paper explores motion-guided features for weakly supervised group activity recognition (GAR). Existing GAR methods simply squeeze extracted tokens or individual features into a single vector by global pooling, which limits their ability to sufficiently represent spatial and temporal salience features in videos. We instead propose a Motion-Guided Network (MGN) to capture crucial motion contextual information in videos. First, we embed local correlations between the feature maps of adjacent frames to extract motion features in activities. Then, unlike previous works that simply aggregate motion and appearance features by addition or concatenation, MGN uses the motion representations to guide the extraction of temporal and spatial features. We evaluated the proposed method on sports and group activity videos, and extensive experimental results verify its effectiveness. Furthermore, our method also outperforms some approaches trained with stronger supervision.
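As a rough illustration of the first step described above (embedding local correlations between the feature maps of adjacent frames), the sketch below builds a small correlation cost volume in NumPy. The function name, the neighborhood `radius`, and the dot-product normalization are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def local_correlation(feat_t, feat_t1, radius=1):
    """Correlate each spatial location of frame t with a
    (2*radius+1)^2 neighborhood around the same location in
    frame t+1 (an illustrative cost-volume sketch, not MGN itself).

    feat_t, feat_t1: (C, H, W) feature maps of adjacent frames.
    Returns: ((2*radius+1)^2, H, W) motion cost volume.
    """
    C, H, W = feat_t.shape
    # Zero-pad frame t+1 spatially so every offset slice stays in bounds.
    pad = np.pad(feat_t1, ((0, 0), (radius, radius), (radius, radius)))
    offsets = [(dy, dx)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)]
    corr = np.empty((len(offsets), H, W), dtype=feat_t.dtype)
    for k, (dy, dx) in enumerate(offsets):
        shifted = pad[:, radius + dy: radius + dy + H,
                         radius + dx: radius + dx + W]
        # Channel-wise dot product, normalized by channel count.
        corr[k] = (feat_t * shifted).sum(axis=0) / C
    return corr
```

With `radius=1` the output has 9 channels, one per displacement; such a volume could then be fed to the network branches that the abstract says are guided by motion representations.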

Original language: English
Article number: 111437
Journal: Engineering Applications of Artificial Intelligence
Volume: 158
DOIs
State: Published - 15 Oct 2025

Keywords

  • Group activity recognition
  • Motion-guided representations
  • Spatial–temporal salience features
  • Weakly supervised learning
