TY - JOUR
T1 - Dynamic proxy domain generalizes the crowd localization by better binary segmentation
AU - Gao, Junyu
AU - Zhang, Da
AU - Wang, Qiyu
AU - Zhao, Zhiyuan
AU - Li, Xuelong
N1 - Publisher Copyright: © 2025
PY - 2026/4
Y1 - 2026/4
N2 - Crowd localization aims to predict the precise location of each instance within an image. Current advanced methods utilize pixel-wise binary classification to address the congested prediction, where pixel-level thresholds convert prediction confidence into binary values for identifying pedestrian heads. Due to the extremely variable contents, counts, and scales in crowd scenes, the confidence-threshold learner is fragile and lacks generalization when encountering domain shifts. Moreover, in most cases, the target domain is unknown during training. Therefore, it is crucial to explore how to enhance the generalization of the confidence-threshold locator to latent target domains. In this paper, we propose a Dynamic Proxy Domain (DPD) method to improve the generalization of the learner under domain shifts. Concretely, informed by the theoretical analysis of the upper bound of generalization error risk for a binary classifier on latent target domains, we introduce a generated proxy domain to facilitate generalization. Then, based on this theory, we design a DPD algorithm consisting of a training paradigm and a proxy domain generator to enhance the domain generalization of the confidence-threshold learner. Additionally, we apply our method to five types of domain shift scenarios, demonstrating its effectiveness in generalizing crowd localization. Our code is available at https://github.com/zhangda1018/DPD.
KW - Binary segmentation
KW - Crowd localization
KW - Domain adaptation
KW - Dynamic proxy domain
UR - https://www.scopus.com/pages/publications/105018099609
U2 - 10.1016/j.patcog.2025.112481
DO - 10.1016/j.patcog.2025.112481
M3 - Article
AN - SCOPUS:105018099609
SN - 0031-3203
VL - 172
JO - Pattern Recognition
JF - Pattern Recognition
M1 - 112481
ER -