TY - GEN
T1 - Multi-rate gated recurrent convolutional networks for video-based pedestrian re-identification
AU - Li, Zhihui
AU - Yao, Lina
AU - Nie, Feiping
AU - Zhang, Dingwen
AU - Xu, Min
N1 - Publisher Copyright:
Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2018
Y1 - 2018
AB - Matching pedestrians across multiple camera views has attracted much recent research attention owing to its importance in surveillance and security applications. While most existing work addresses this problem in a still-image setting, we consider the more informative and challenging video-based person re-identification problem, in which a video of a pedestrian as seen in one camera must be matched against a gallery of videos captured by other, non-overlapping cameras. We employ a convolutional network to extract appearance and motion features from raw video sequences and feed them into a multi-rate recurrent network to exploit temporal correlations and, more importantly, to account for the fact that pedestrians, sometimes even the same pedestrian, move at different speeds across camera views. The combined network is trained in an end-to-end fashion, and we further propose an initialization strategy via context reconstruction that largely improves performance. We conduct extensive experiments on the iLIDS-VID and PRID-2011 datasets, and our experimental results confirm the effectiveness and the generalization ability of our model.
UR - http://www.scopus.com/inward/record.url?scp=85052237737&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85052237737
T3 - 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
SP - 7081
EP - 7088
BT - 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
PB - AAAI Press
T2 - 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
Y2 - 2 February 2018 through 7 February 2018
ER -