Deep representation for abnormal event detection in crowded scenes

Yachuang Feng, Yuan Yuan, Xiaoqiang Lu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

38 Scopus citations

Abstract

Abnormal event detection is extremely important, especially for video surveillance. Many detectors based on hand-crafted features have been proposed. However, it remains challenging to effectively distinguish abnormal events from normal ones. This paper proposes a deep-representation-based algorithm that extracts features in an unsupervised fashion. Specifically, appearance, texture, and short-term motion features are automatically learned and fused with stacked denoising autoencoders. Subsequently, long-term temporal cues are modeled with a long short-term memory (LSTM) recurrent network in order to discover meaningful regularities of video events. Abnormal events are identified as samples that disobey these regularities. Moreover, this paper proposes a spatial anomaly detection strategy via manifold ranking, aimed at excluding false alarms. Experiments and comparisons on real-world datasets show that the proposed algorithm outperforms state-of-the-art methods for abnormal event detection in crowded scenes.
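
The abstract names manifold ranking but does not give the formulation, so the following is only a minimal sketch of the generic iterative manifold ranking update, f <- alpha * S f + (1 - alpha) * y, and not the authors' method. The function name `manifold_ranking`, the Gaussian affinity, the parameters `alpha`, `sigma`, and `iters`, and the toy 2-D points are all assumptions made for illustration. The idea shown is that samples weakly connected to the chosen reference (query) samples receive low ranking scores.

```python
import math

def manifold_ranking(points, query_idx, alpha=0.9, sigma=1.0, iters=200):
    """Generic iterative manifold ranking (illustrative, hypothetical helper).

    Needs at least two points so that every node has a nonzero degree.
    """
    n = len(points)
    # Gaussian affinity matrix W with a zero diagonal.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    # Symmetric normalisation S = D^{-1/2} W D^{-1/2}.
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    # Indicator vector y marks the reference (query) samples.
    y = [1.0 if i in query_idx else 0.0 for i in range(n)]
    f = y[:]
    # Propagate scores along the graph until (approximately) converged.
    for _ in range(iters):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f

# Toy data (assumed): four nearby points and one far-away point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = manifold_ranking(points, query_idx={0})
for p, s in zip(points, scores):
    print(p, round(s, 6))
```

In this toy setting the isolated point ends up with a score far below that of the clustered points, which is the kind of signal a ranking-based spatial check could use. How the paper builds its graph and chooses its queries is not stated in the abstract.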

Original language: English
Title of host publication: MM 2016 - Proceedings of the 2016 ACM Multimedia Conference
Publisher: Association for Computing Machinery, Inc
Pages: 591-595
Number of pages: 5
ISBN (Electronic): 9781450336031
State: Published - 1 Oct 2016
Externally published: Yes
Event: 24th ACM Multimedia Conference, MM 2016 - Amsterdam, Netherlands
Duration: 15 Oct 2016 → 19 Oct 2016

Publication series

Name: MM 2016 - Proceedings of the 2016 ACM Multimedia Conference

Conference

Conference: 24th ACM Multimedia Conference, MM 2016
Country/Territory: Netherlands
City: Amsterdam
Period: 15/10/16 → 19/10/16

Keywords

  • Abnormal event detection
  • Crowded scene
  • Deep representation
  • Video surveillance
