Scene-Dependent Prediction in Latent Space for Video Anomaly Detection and Anticipation

Congqi Cao, Hanwen Zhang, Yue Lu, Peng Wang, Yanning Zhang

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

Video anomaly detection (VAD) plays a crucial role in intelligent surveillance. However, an essential type of anomaly, the scene-dependent anomaly, has been overlooked. Moreover, the task of video anomaly anticipation (VAA) also deserves attention. To fill these gaps, we build a comprehensive dataset named NWPU Campus, which is the largest semi-supervised VAD dataset and the first dataset for scene-dependent VAD and VAA. We also introduce a novel forward-backward framework for scene-dependent VAD and VAA, in which the forward network solves VAD on its own and solves VAA jointly with the backward network. In particular, we propose a scene-dependent generative model in latent space for the forward and backward networks. First, we propose a hierarchical variational auto-encoder to extract scene-generic features. Next, we design a score-based diffusion model in latent space that refines these features into a more compact form for the task and, together with a scene information auto-encoder, generates scene-dependent features, modeling the relationships between video events and scenes. Finally, we develop a temporal loss from key frames to constrain the motion consistency of video clips. Extensive experiments demonstrate that our method handles both scene-dependent anomaly detection and anticipation well, achieving state-of-the-art performance on ShanghaiTech, CUHK Avenue, and the proposed NWPU Campus datasets.

Original language: English
Pages (from-to): 224-239
Number of pages: 16
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 47
Issue number: 1
DOI
Publication status: Published - 2025
