TY - GEN
T1 - A Flow Base Bi-path Network for Cross-Scene Video Crowd Understanding in Aerial View
AU - Zhao, Zhiyuan
AU - Han, Tao
AU - Gao, Junyu
AU - Wang, Qi
AU - Li, Xuelong
N1 - Publisher Copyright:
© 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
AB - Drone footage can be applied to dynamic traffic monitoring, object detection and tracking, and other vision tasks. The variability of the shooting location adds intractable challenges to these missions, such as varying scale, unstable exposure, and scene migration. In this paper, we strive to tackle the above challenges and automatically understand the crowd from visual data collected by drones. First, to alleviate the background noise generated in cross-scene testing, a double-stream crowd counting model is proposed, which extracts optical flow and frame difference information as an additional branch. Besides, to improve the model's generalization ability across different scales and times, we randomly combine a variety of data transformation methods to simulate unseen environments. To tackle the crowd density estimation problem in extremely dark environments, we introduce synthetic data generated by the game Grand Theft Auto V (GTA V). Experimental results show the effectiveness of the virtual data. Our method wins the challenge with a mean absolute error (MAE) of 12.701. Moreover, a comprehensive ablation study is conducted to explore each component's contribution.
KW - Crowd counting
KW - Data augmentation
KW - Optical flow
KW - Synthetic data
UR - http://www.scopus.com/inward/record.url?scp=85101753757&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-66823-5_34
DO - 10.1007/978-3-030-66823-5_34
M3 - Conference contribution
AN - SCOPUS:85101753757
SN - 9783030668228
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 574
EP - 587
BT - Computer Vision – ECCV 2020 Workshops, Proceedings
A2 - Bartoli, Adrien
A2 - Fusiello, Andrea
PB - Springer Science and Business Media Deutschland GmbH
T2 - Workshops held at the 16th European Conference on Computer Vision, ECCV 2020
Y2 - 23 August 2020 through 28 August 2020
ER -