Gated forward refinement network for action segmentation

Dong Wang, Yuan Yuan, Qi Wang

Research output: Contribution to journal, Article, peer-reviewed

18 citations (Scopus)

Abstract

Action segmentation aims to temporally locate and classify video segments in long untrimmed videos, a task of particular interest to applications such as surveillance and robotics. While most existing methods tackle this task by predicting frame-wise probabilities and adjusting them with high-level temporal models, recent approaches classify every video frame directly with temporal convolutions. However, the quality of these predictions is limited by ambiguous information in the video frames. To address this limitation, we propose an end-to-end multi-stage architecture, the Gated Forward Refinement Network (G-FRNet). In G-FRNet, each stage makes a prediction that is progressively refined by the next stage. Specifically, we propose a new gated forward refinement network that adaptively corrects the errors in the prediction from the previous stage, where an effective gate unit controls the refinement process. Moreover, to optimize the proposed G-FRNet efficiently, we design an objective function that consists of a classification loss and a multi-stage sequence-level refinement loss that incorporates the segmental edit score via policy gradient. Extensive evaluation on three challenging datasets (50Salads, Georgia Tech Egocentric Activities (GTEA), and the Breakfast dataset) shows that our method achieves state-of-the-art results.
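
The segmental edit score mentioned in the abstract can be illustrated with a minimal sketch. It assumes the definition commonly used in action segmentation: frame-wise labels are collapsed into an ordered list of segment labels, and the Levenshtein distance between the predicted and ground-truth lists is normalized by the longer of the two. The function names below are hypothetical and are not taken from the paper, and the sketch does not reproduce the policy-gradient loss that the authors build on top of this score.

```python
def segments(frame_labels):
    """Collapse frame-wise labels into the ordered list of segment labels."""
    out = []
    for label in frame_labels:
        if not out or out[-1] != label:
            out.append(label)
    return out


def levenshtein(a, b):
    """Edit distance between two label sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def segmental_edit_score(pred_frames, true_frames):
    """Score in [0, 100]; assumed normalization is by the longer segment list."""
    p, t = segments(pred_frames), segments(true_frames)
    longest = max(len(p), len(t))
    if longest == 0:
        return 100.0
    return (1.0 - levenshtein(p, t) / longest) * 100.0


# Boundary shifts leave the score unchanged; over-segmentation lowers it.
shifted = segmental_edit_score(["a", "a", "b", "b", "c"], ["a", "b", "b", "b", "c"])
oversegmented = segmental_edit_score(["a", "b", "a", "b", "c"], ["a", "b", "b", "b", "c"])
```

Because the score depends only on the order of segments, not on their exact boundaries, it penalizes over-segmentation errors that frame-wise accuracy barely registers. The score is not differentiable, which is consistent with the abstract's choice to incorporate it through policy gradient.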

Original language: English
Pages (from-to): 63-71
Number of pages: 9
Journal: Neurocomputing
Volume: 407
Publication status: Published - 24 Sep 2020
