SCOAD: Single-Frame Click Supervision for Online Action Detection

Na Ye, Xing Zhang, Dawei Yan, Wei Dong, Qingsen Yan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citations

Abstract

Online action detection based on supervised learning requires heavy manual annotation, which is difficult to obtain and may be impractical in real applications. Weakly supervised online action detection (WOAD) can effectively mitigate the problem of substantial labeling costs by using video-level labels. In this paper, we revisit WOAD and propose a weakly supervised online action detection method trained with click-level labels, named Single-frame Click Supervision for Online Action Detection (SCOAD). Compared with video-level labels, click-level labels carry a small amount of temporal information and can effectively improve prediction accuracy without massively increasing the difficulty and cost of annotation. Specifically, SCOAD includes two jointly trained modules, i.e., Action Instance Miner (AIM) and Online Action Detector (OAD). To guide network training as accurately as possible, AIM mines pseudo-action instances under the supervision of click labels. Meanwhile, we generate video similarity instances offline from the similarity between video frames and use them to filter, at a finer granularity, the erroneous instances generated by AIM. OAD is trained jointly with AIM for online action detection using pseudo frame-level labels converted from the filtered pseudo-action instances. We conduct extensive experiments on two benchmark datasets to demonstrate that SCOAD can effectively mine and utilize the small amount of temporal information in click-level labels. Code is available at https://github.com/zstarN70/SCOAD.git.

Original language: English
Title of host publication: Computer Vision – ACCV 2022 - 16th Asian Conference on Computer Vision, Proceedings
Editors: Lei Wang, Juergen Gall, Tat-Jun Chin, Imari Sato, Rama Chellappa
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 223-238
Number of pages: 16
ISBN (Print): 9783031263156
DOIs
State: Published - 2023
Event: 16th Asian Conference on Computer Vision, ACCV 2022 - Macao, China
Duration: 4 Dec 2022 → 8 Dec 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13844 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th Asian Conference on Computer Vision, ACCV 2022
Country/Territory: China
City: Macao
Period: 4/12/22 → 8/12/22

Keywords

  • Online action detection
  • Weakly supervised learning
