Scale-aware local difference attention on pyramidal features for crowd counting

Qian Zhang, Shizhou Zhang, Xinyao Liu, Yanning Zhang

Research output: Contribution to journal › Article › peer-review

Abstract

Automatic crowd counting via computer vision has attracted considerable attention because of its many practical applications. The task poses several challenges, and one of the main difficulties is scale variation: the scale of people's heads varies dramatically both across images and between different regions of the same image. In this paper, we tackle the problem with a novel scale-aware counting model named FPN-LDA Net. The Feature Pyramid Network (FPN) handles scale variation by fusing multi-scale feature maps from different depth levels of the network, and the Local Difference Attention (LDA) module captures the local differences between the multi-scale pyramid pooling features at a specific location and those of its neighborhood. To handle head scale variation within a single image, the dynamically learned difference scores are used as weights that adaptively highlight the scale-varying head regions that need attention and filter out irrelevant background regions. We conduct extensive experiments on three widely adopted benchmark datasets: UCF-QNRF, ShanghaiTech and UCF_CC_50. The experimental results demonstrate the superiority of the proposed method.
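The local-difference idea in the abstract can be illustrated with a small sketch. This is not the authors' FPN-LDA Net: it assumes a single-channel feature map, replaces the learned difference scores with a fixed absolute difference between a location and its average-pooled neighborhood at several window sizes, and squashes the score with tanh. The names `avg_pool_same` and `local_difference_attention` and the `scales` parameter are hypothetical and chosen only for illustration.

```python
import math


def avg_pool_same(fmap, k):
    """Mean over a k x k window centred on each cell (window clipped at borders)."""
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [
                fmap[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            out[i][j] = sum(vals) / len(vals)
    return out


def local_difference_attention(fmap, scales=(1, 3, 5)):
    """Re-weight each cell by how much it differs from its neighbourhood.

    Assumption: a hand-written score stands in for the learned one.
    The first entry of `scales` describes the location itself; the
    remaining entries describe progressively larger neighbourhoods.
    """
    h, w = len(fmap), len(fmap[0])
    pooled = [avg_pool_same(fmap, k) for k in scales]
    local = pooled[0]
    weights = [[0.0] * w for _ in range(h)]
    attended = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Sum of absolute differences between the location and each neighbourhood.
            score = sum(abs(local[i][j] - p[i][j]) for p in pooled[1:])
            # tanh maps a zero score (flat background) to weight 0.
            weights[i][j] = math.tanh(score)
            attended[i][j] = fmap[i][j] * weights[i][j]
    return attended, weights


# Toy input: one isolated activation on an otherwise flat background.
fmap = [[0.0] * 5 for _ in range(5)]
fmap[2][2] = 1.0
attended, weights = local_difference_attention(fmap, scales=(1, 3))
```

In this toy case the isolated activation receives the largest weight, its immediate neighbors receive small non-zero weights, and the flat corners receive a weight of zero, which mirrors the abstract's description of highlighting head regions and filtering out background.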

Original language: English
Pages (from-to): 5165-5180
Number of pages: 16
Journal: Multimedia Tools and Applications
Volume: 83
Issue number: 2
State: Published - Jan 2024

Keywords

  • Attention mechanism
  • Convolutional neural network
  • Crowd counting
  • Deep learning
  • FPN
