Abstract
Person re-identification is a critical yet challenging task in video surveillance. It aims to match the same person across cameras, although in practice a person's appearance varies greatly from one camera to another. Most deep learning methods rely on single-level features from a deep layer and ignore the low-level, detailed features of shallow layers, because different layers produce feature maps of different sizes and features from different layers cannot be concatenated without extra downsampling or upsampling. To remedy this problem, we propose a novel yet simple self-attention learning method for person re-identification. We design a convolutional neural network (CNN) that captures multi-level information from different layers while keeping the spatial resolution of the feature maps unchanged by using dilated convolution. The multi-level information consists of two parts: multi-level attention maps and multi-level feature maps. The multi-level attention maps are constrained, and the multi-level feature maps are easily concatenated. We also combine softmax loss with quadruplet loss, taking full advantage of both labels and metric learning at the same time. Experimental results demonstrate that the proposed method achieves excellent performance for person re-identification, and that the self-attention constraint can also be used in many other tasks.
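The two mechanisms named in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. The first part uses the standard convolution output-size formula to show why stride-1 dilated layers with matching padding keep feature maps the same size (and therefore concatenable), whereas strided layers shrink them. The second part shows one common formulation of a quadruplet loss on squared Euclidean distances, added to a softmax cross-entropy term. All function names, the margins `margin1` and `margin2`, the `weight` factor, and the layer settings are assumptions chosen for illustration; the abstract does not specify them.

```python
import math


def conv_output_size(size, kernel=3, stride=1, dilation=1, padding=0):
    """Output length along one spatial axis of a convolution."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


def same_padding(kernel, dilation):
    """Padding that preserves size for stride 1 and an odd kernel."""
    return dilation * (kernel - 1) // 2


def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quadruplet_loss(anchor, positive, negative1, negative2,
                    margin1=1.0, margin2=0.5):
    """One common quadruplet formulation (margins are hypothetical).

    The first term is a triplet-style constraint relative to the anchor;
    the second compares the positive pair with a pair of two other identities.
    """
    d_ap = squared_distance(anchor, positive)
    d_an = squared_distance(anchor, negative1)
    d_nn = squared_distance(negative1, negative2)
    return max(0.0, d_ap - d_an + margin1) + max(0.0, d_ap - d_nn + margin2)


def softmax_cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def combined_loss(logits, label, quadruplet, weight=1.0):
    """Identity loss plus weighted metric loss (weight is an assumption)."""
    return softmax_cross_entropy(logits, label) + weight * quadruplet_loss(*quadruplet)


# Three stride-2 layers shrink a 64-pixel axis at every level.
strided_sizes = [64]
for _ in range(3):
    strided_sizes.append(
        conv_output_size(strided_sizes[-1], kernel=3, stride=2, padding=1))

# Three stride-1 dilated layers keep the size, so levels can be concatenated.
dilated_sizes = [64]
for d in (1, 2, 4):
    dilated_sizes.append(
        conv_output_size(dilated_sizes[-1], kernel=3, stride=1,
                         dilation=d, padding=same_padding(3, d)))

print(strided_sizes)  # [64, 32, 16, 8]
print(dilated_sizes)  # [64, 64, 64, 64]
```

In this sketch the dilated stack widens the receptive field through the dilation rate instead of through striding, which is why no downsampling or upsampling is needed before concatenating the levels.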
Original language | English |
---|---|
State | Published - 2019 |
Event | 29th British Machine Vision Conference, BMVC 2018 - Newcastle, United Kingdom; Duration: 3 Sep 2018 → 6 Sep 2018 |
Conference
Conference | 29th British Machine Vision Conference, BMVC 2018 |
---|---|
Country/Territory | United Kingdom |
City | Newcastle |
Period | 3/09/18 → 6/09/18 |