MRRNet: Learning multiple region representation for video person re-identification

Hui Fu; Ke Zhang; Haoyu Li; Jingyu Wang

doi:10.1016/j.engappai.2022.105108

School of Astronautics

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

Video person re-identification is a crucial component of a robust surveillance system. Within a video clip, different human regions exhibit unique stability characteristics, which would be harmful to generating a discriminative representation. Unfortunately, prior works cannot effectively deal with the stability characteristics of different regions. To tackle this problem, we propose a Multiple Region Representation Network (MRRNet) that aims to discover the discriminative information from different human regions. Firstly, a Stable Region Representation (SRR) layer is proposed to capture important clues from the stable regions and exchange temporal information by a cross-relation-aware operation. Secondly, a Multiple Region Representation (MRR) layer is designed to address the unstable regions and preserve the attention on stable regions. Thirdly, SRR and MRR can be conveniently inserted into multiple stages of deep residual networks and significantly improve the performance of the network. Comprehensive experiments validate the effectiveness of our network. In particular, MRRNet achieves 86.7% mAP and 91.1% Rank-1 accuracy on the MARS dataset, outperforming state-of-the-art methods.

Original language	English
Article number	105108
Journal	Engineering Applications of Artificial Intelligence
Volume	114
DOIs	https://doi.org/10.1016/j.engappai.2022.105108
State	Published - Sep 2022

Keywords

Cross-relation aware
Multiple region representation
Self-relation aware
Stable region representation
Video person re-identification

Access to Document

10.1016/j.engappai.2022.105108

Cite this

@article{7646205dc4104ee2ae6e8327e0e25636,

title = "MRRNet: Learning multiple region representation for video person re-identification",

abstract = "Video person re-identification is a crucial component of a robust surveillance system. Within a video clip, different human regions exhibit unique stability characteristics, which would be harmful to generating a discriminative representation. Unfortunately, prior works cannot effectively deal with the stability characteristics of different regions. To tackle this problem, we propose a Multiple Region Representation Network (MRRNet) that aims to discover the discriminative information from different human regions. Firstly, a Stable Region Representation (SRR) layer is proposed to capture important clues from the stable regions and exchange temporal information by cross-relation aware operation. Secondly, a Multiple Region Representation (MRR) layer is designed to address the unstable regions and preserve the attention on stable regions. Thirdly, SRR and MRR can be conveniently inserted into multiple stages of the deep residual networks and significantly improve the performance of the network. Comprehensive experiments validate the effectiveness of our network. Particularly, MRRNet achieves 86.7% mAP and 91.1% Rank-1 accuracy on the MARS dataset, which outperforms state-of-the-arts.",

keywords = "Cross-relation aware, Multiple region representation, Self-relation aware, Stable region representation, Video person re-identification",

author = "Hui Fu and Ke Zhang and Haoyu Li and Jingyu Wang",

note = "Publisher Copyright: {\textcopyright} 2022 Elsevier Ltd",

year = "2022",

month = sep,

doi = "10.1016/j.engappai.2022.105108",

language = "English",

volume = "114",

journal = "Engineering Applications of Artificial Intelligence",

issn = "0952-1976",

publisher = "Elsevier Ltd",

}

TY - JOUR

T1 - MRRNet

T2 - Learning multiple region representation for video person re-identification

AU - Fu, Hui

AU - Zhang, Ke

AU - Li, Haoyu

AU - Wang, Jingyu

PY - 2022/9

Y1 - 2022/9

N2 - Video person re-identification is a crucial component of a robust surveillance system. Within a video clip, different human regions exhibit unique stability characteristics, which would be harmful to generating a discriminative representation. Unfortunately, prior works cannot effectively deal with the stability characteristics of different regions. To tackle this problem, we propose a Multiple Region Representation Network (MRRNet) that aims to discover the discriminative information from different human regions. Firstly, a Stable Region Representation (SRR) layer is proposed to capture important clues from the stable regions and exchange temporal information by cross-relation aware operation. Secondly, a Multiple Region Representation (MRR) layer is designed to address the unstable regions and preserve the attention on stable regions. Thirdly, SRR and MRR can be conveniently inserted into multiple stages of the deep residual networks and significantly improve the performance of the network. Comprehensive experiments validate the effectiveness of our network. Particularly, MRRNet achieves 86.7% mAP and 91.1% Rank-1 accuracy on the MARS dataset, which outperforms state-of-the-arts.

AB - Video person re-identification is a crucial component of a robust surveillance system. Within a video clip, different human regions exhibit unique stability characteristics, which would be harmful to generating a discriminative representation. Unfortunately, prior works cannot effectively deal with the stability characteristics of different regions. To tackle this problem, we propose a Multiple Region Representation Network (MRRNet) that aims to discover the discriminative information from different human regions. Firstly, a Stable Region Representation (SRR) layer is proposed to capture important clues from the stable regions and exchange temporal information by cross-relation aware operation. Secondly, a Multiple Region Representation (MRR) layer is designed to address the unstable regions and preserve the attention on stable regions. Thirdly, SRR and MRR can be conveniently inserted into multiple stages of the deep residual networks and significantly improve the performance of the network. Comprehensive experiments validate the effectiveness of our network. Particularly, MRRNet achieves 86.7% mAP and 91.1% Rank-1 accuracy on the MARS dataset, which outperforms state-of-the-arts.

KW - Cross-relation aware

KW - Multiple region representation

KW - Self-relation aware

KW - Stable region representation

KW - Video person re-identification

UR - http://www.scopus.com/inward/record.url?scp=85133219735&partnerID=8YFLogxK

U2 - 10.1016/j.engappai.2022.105108

DO - 10.1016/j.engappai.2022.105108

M3 - Article

AN - SCOPUS:85133219735

SN - 0952-1976

VL - 114

JO - Engineering Applications of Artificial Intelligence

JF - Engineering Applications of Artificial Intelligence

M1 - 105108

ER -
