TY - JOUR
T1 - PCC Net: Perspective crowd counting via spatial convolutional network
T2 - IEEE Transactions on Circuits and Systems for Video Technology
AU - Gao, Junyu
AU - Wang, Qi
AU - Li, Xuelong
N1 - Publisher Copyright:
© 1991-2012 IEEE.
PY - 2020/10
Y1 - 2020/10
N2 - Crowd counting from a single image is a challenging task due to high appearance similarity, perspective changes, and severe congestion. Many methods focus only on local appearance features and cannot handle these challenges. To tackle them, we propose a perspective crowd counting network (PCC Net), which consists of three parts: 1) density map estimation (DME), which focuses on learning very local features for density map estimation; 2) random high-level density classification (R-HDC), which extracts global features to predict coarse density labels of random patches in images; and 3) fore-/background segmentation (FBS), which encodes mid-level features to segment the foreground and background. In addition, a Down, Up, Left, and Right (DULR) module is embedded in PCC Net to encode perspective changes in four directions. The proposed PCC Net is verified on five mainstream datasets, achieving state-of-the-art performance on one and competitive results on the other four. The source code is available at https://github.com/gjy3035/PCC-Net.
AB - Crowd counting from a single image is a challenging task due to high appearance similarity, perspective changes, and severe congestion. Many methods focus only on local appearance features and cannot handle these challenges. To tackle them, we propose a perspective crowd counting network (PCC Net), which consists of three parts: 1) density map estimation (DME), which focuses on learning very local features for density map estimation; 2) random high-level density classification (R-HDC), which extracts global features to predict coarse density labels of random patches in images; and 3) fore-/background segmentation (FBS), which encodes mid-level features to segment the foreground and background. In addition, a Down, Up, Left, and Right (DULR) module is embedded in PCC Net to encode perspective changes in four directions. The proposed PCC Net is verified on five mainstream datasets, achieving state-of-the-art performance on one and competitive results on the other four. The source code is available at https://github.com/gjy3035/PCC-Net.
KW - Crowd counting
KW - background segmentation
KW - crowd analysis
KW - multi-task learning
KW - spatial convolutional network
UR - http://www.scopus.com/inward/record.url?scp=85092446345&partnerID=8YFLogxK
U2 - 10.1109/TCSVT.2019.2919139
DO - 10.1109/TCSVT.2019.2919139
M3 - Article
AN - SCOPUS:85092446345
SN - 1051-8215
VL - 30
SP - 3486
EP - 3498
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 10
M1 - 8723079
ER -