NWPU-MOC: A Benchmark for Fine-Grained Multicategory Object Counting in Aerial Images

Junyu Gao; Liangliang Zhao; Xuelong Li

doi:10.1109/TGRS.2024.3356492

School of Artificial Intelligence, Optics and Electronics

Research output: Contribution to journal › Article › peer-review

30 Scopus citations

Abstract

Object counting is a widely studied task in computer vision that aims to estimate the number of objects in a given image. However, most methods count objects of only a single category per image, so they cannot be applied to scenes that require counting objects of multiple categories simultaneously, especially aerial scenes. To this end, this article introduces a multicategory object-counting (MOC) task to estimate the numbers of different objects (cars, buildings, ships, etc.) in an aerial image. Because no dataset exists for this task, a large-scale dataset (NWPU-MOC) is collected, consisting of 3416 scenes with a resolution of $1024\times1024$ pixels, annotated with 14 fine-grained object categories. In addition, each scene contains RGB and near-infrared (NIR) images; the NIR spectrum provides richer characterization information than the RGB spectrum alone. Based on NWPU-MOC, the article presents a multispectrum MOC framework, which employs a dual-attention module to fuse the RGB and NIR features and then regresses multichannel density maps corresponding to each object category. In addition to modeling the dependence between different channels in the density map with each object category, a spatial contrast loss is designed as a penalty for overlapping predictions at the same spatial position. Experimental results demonstrate that the proposed method achieves state-of-the-art performance compared with some mainstream counting algorithms. The dataset, code, and models are publicly available at https://github.com/lyongo/NWPU-MOC.
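
The abstract describes regressing one density map per object category. In density-map counting, the estimated count for a category is conventionally obtained by summing that category's map. The sketch below illustrates only that convention on toy data; it is not the authors' code, and the names `counts_from_density_maps`, `mean_absolute_error`, and `toy_maps` are hypothetical.

```python
from typing import List

DensityMap = List[List[float]]  # one H x W map for a single category


def counts_from_density_maps(maps: List[DensityMap]) -> List[float]:
    """Sum each per-category density map to get that category's estimated count."""
    return [sum(sum(row) for row in channel) for channel in maps]


def mean_absolute_error(pred: List[float], truth: List[float]) -> float:
    """Absolute counting error averaged over categories."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


# Toy input (assumption): 2 categories on a 2 x 2 grid.
# The dataset described above uses 14 categories at 1024 x 1024 pixels.
toy_maps = [
    [[0.5, 0.5], [1.0, 0.0]],      # category 0: density mass sums to 2.0
    [[0.25, 0.25], [0.25, 0.25]],  # category 1: density mass sums to 1.0
]
counts = counts_from_density_maps(toy_maps)
```

Here `counts` holds one estimate per category, which could then be compared with ground-truth counts using a metric such as `mean_absolute_error`.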

Original language	English
Article number	5606614
Pages (from-to)	1-14
Number of pages	14
Journal	IEEE Transactions on Geoscience and Remote Sensing
Volume	62
DOIs	https://doi.org/10.1109/TGRS.2024.3356492
State	Published - 2024

Keywords

Benchmark
multispectral aerial image
object counting
remote sensing

Access to Document

10.1109/TGRS.2024.3356492

Cite this

@article{806d9e6053b6411b9c95cf8038358bff,

title = "NWPU-MOC: A Benchmark for Fine-Grained Multicategory Object Counting in Aerial Images",

abstract = "Object counting is a widely studied task in computer vision that aims to estimate the number of objects in a given image. However, most methods count objects of only a single category per image, so they cannot be applied to scenes that require counting objects of multiple categories simultaneously, especially aerial scenes. To this end, this article introduces a multicategory object-counting (MOC) task to estimate the numbers of different objects (cars, buildings, ships, etc.) in an aerial image. Because no dataset exists for this task, a large-scale dataset (NWPU-MOC) is collected, consisting of 3416 scenes with a resolution of $1024\times1024$ pixels, annotated with 14 fine-grained object categories. In addition, each scene contains RGB and near-infrared (NIR) images; the NIR spectrum provides richer characterization information than the RGB spectrum alone. Based on NWPU-MOC, the article presents a multispectrum MOC framework, which employs a dual-attention module to fuse the RGB and NIR features and then regresses multichannel density maps corresponding to each object category. In addition to modeling the dependence between different channels in the density map with each object category, a spatial contrast loss is designed as a penalty for overlapping predictions at the same spatial position. Experimental results demonstrate that the proposed method achieves state-of-the-art performance compared with some mainstream counting algorithms. The dataset, code, and models are publicly available at https://github.com/lyongo/NWPU-MOC.",

keywords = "Benchmark, multispectral aerial image, object counting, remote sensing",

author = "Junyu Gao and Liangliang Zhao and Xuelong Li",

note = "Publisher Copyright: {\textcopyright} 1980-2012 IEEE.",

year = "2024",

doi = "10.1109/TGRS.2024.3356492",

language = "English",

volume = "62",

pages = "1--14",

journal = "IEEE Transactions on Geoscience and Remote Sensing",

issn = "0196-2892",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - NWPU-MOC

T2 - A Benchmark for Fine-Grained Multicategory Object Counting in Aerial Images

AU - Gao, Junyu

AU - Zhao, Liangliang

AU - Li, Xuelong

PY - 2024

Y1 - 2024

N2 - Object counting is a widely studied task in computer vision that aims to estimate the number of objects in a given image. However, most methods count objects of only a single category per image, so they cannot be applied to scenes that require counting objects of multiple categories simultaneously, especially aerial scenes. To this end, this article introduces a multicategory object-counting (MOC) task to estimate the numbers of different objects (cars, buildings, ships, etc.) in an aerial image. Because no dataset exists for this task, a large-scale dataset (NWPU-MOC) is collected, consisting of 3416 scenes with a resolution of $1024\times1024$ pixels, annotated with 14 fine-grained object categories. In addition, each scene contains RGB and near-infrared (NIR) images; the NIR spectrum provides richer characterization information than the RGB spectrum alone. Based on NWPU-MOC, the article presents a multispectrum MOC framework, which employs a dual-attention module to fuse the RGB and NIR features and then regresses multichannel density maps corresponding to each object category. In addition to modeling the dependence between different channels in the density map with each object category, a spatial contrast loss is designed as a penalty for overlapping predictions at the same spatial position. Experimental results demonstrate that the proposed method achieves state-of-the-art performance compared with some mainstream counting algorithms. The dataset, code, and models are publicly available at https://github.com/lyongo/NWPU-MOC.


KW - Benchmark

KW - multispectral aerial image

KW - object counting

KW - remote sensing

UR - http://www.scopus.com/inward/record.url?scp=85182927131&partnerID=8YFLogxK

U2 - 10.1109/TGRS.2024.3356492

DO - 10.1109/TGRS.2024.3356492

M3 - Article

AN - SCOPUS:85182927131

SN - 0196-2892

VL - 62

SP - 1

EP - 14

JO - IEEE Transactions on Geoscience and Remote Sensing

JF - IEEE Transactions on Geoscience and Remote Sensing

M1 - 5606614

ER -
