Text-based Person Search in Full Images via Semantic-Driven Proposal Generation

Shizhou Zhang, De Cheng, Wenlong Luo, Yinghui Xing, Duo Long, Hao Li, Kai Niu, Guoqiang Liang, Yanning Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Scopus citations

Abstract

Finding target persons in full scene images from a text description query has important practical applications in intelligent video surveillance. However, unlike real-world scenarios, where bounding boxes are not available, existing text-based person retrieval methods mainly focus on cross-modal matching between the query text descriptions and a gallery of cropped pedestrian images. To close this gap, we study the problem of text-based person search in full images and propose a new end-to-end learning framework that jointly optimizes the pedestrian detection, identification, and visual-semantic feature embedding tasks. To take full advantage of the query text, the semantic features are leveraged to instruct the Region Proposal Network to pay more attention to the text-described proposals. In addition, a cross-scale visual-semantic embedding mechanism is used to improve performance. To validate the proposed method, we collect and annotate two large-scale benchmark datasets based on the widely adopted image-based person search datasets CUHK-SYSU and PRW. Comprehensive experiments on the two datasets show that, compared with the baseline methods, our method achieves state-of-the-art performance.
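The idea of letting the query text steer proposal selection can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how the semantic features enter the Region Proposal Network, so the gating rule below (objectness multiplied by a cosine-similarity gate) and the names `rescore_proposals` and `top_k` are assumptions made purely for illustration, with toy two-dimensional features standing in for learned embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rescore_proposals(objectness, proposal_feats, text_feat):
    """Hypothetical rule: weight each proposal's objectness by its text match."""
    scores = []
    for obj, feat in zip(objectness, proposal_feats):
        sim = cosine(feat, text_feat)   # similarity in [-1, 1]
        gate = 0.5 * (1.0 + sim)        # mapped to [0, 1]
        scores.append(obj * gate)
    return scores


def top_k(scores, k):
    """Indices of the k highest-scoring proposals, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Toy data: three proposals with made-up features and objectness scores.
objectness = [0.9, 0.8, 0.6]
proposal_feats = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
text_feat = [0.0, 1.0]

scores = rescore_proposals(objectness, proposal_feats, text_feat)
ranked = top_k(scores, 2)
print(ranked)
```

In this toy case the proposal with the highest raw objectness is pushed out of the top two because its feature is orthogonal to the text feature, which is the qualitative behaviour the abstract describes as paying more attention to text-described proposals.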

Original language: English
Title of host publication: HCMA 2023 - Proceedings of the 4th International Workshop on Human-centric Multimedia Analysis, Co-located with
Subtitle of host publication: MM 2023
Publisher: Association for Computing Machinery, Inc
Pages: 5-14
Number of pages: 10
ISBN (Electronic): 9798400702723
State: Published - 2 Nov 2023
Event: 4th International Workshop on Human-centric Multimedia Analysis, HCMA 2023 - Ottawa, Canada
Duration: 2 Nov 2023 → …

Publication series

Name: HCMA 2023 - Proceedings of the 4th International Workshop on Human-centric Multimedia Analysis, Co-located with: MM 2023

Conference

Conference: 4th International Workshop on Human-centric Multimedia Analysis, HCMA 2023
Country/Territory: Canada
City: Ottawa
Period: 2/11/23 → …

Keywords

  • cross-scale alignment
  • semantic-driven RPN
  • text-based person search
