TY - JOUR
T1 - Cache-aided cross-modal correlation correction for unsupervised cross-domain text-based person search
AU - Niu, Kai
AU - Zhao, Qinzi
AU - Chen, Jiahui
AU - Zhang, Yanning
N1 - Publisher Copyright: © 2025 Elsevier Ltd
PY - 2026/4
Y1 - 2026/4
N2 - Unsupervised Cross-domain Text-Based Person Search (UC-TBPS) must contend not only with modality heterogeneity but also with the cross-domain difficulty that arises in more practical surveillance scenarios. However, little research has focused on the cross-domain difficulty, which may severely hinder real-world applications of TBPS. In this paper, we propose the Test-time Cache-aided Cross-modal Correlation Correction (TC4) method, a pioneering approach that specifically addresses the UC-TBPS task through test-time re-ranking. First, we cluster the pedestrian image gallery and construct reward and penalty caches based on the cluster centers; these caches store additional sentence relays to alleviate the cross-domain problem. Second, we calculate reward and penalty values under the guidance of these two caches, respectively, to refine the image-sentence correlations at appropriately located positions. Finally, the refined image-sentence correlations are used to re-rank the original retrieval results. As a test-time re-ranking approach, TC4 does not require fine-tuning in the target domain and improves retrieval performance with negligible additional overhead. Extensive experiments and analyses on UC-TBPS and on unsupervised cross-domain image-text matching validate the effectiveness and generalization capacity of the proposed TC4 solution.
AB - Unsupervised Cross-domain Text-Based Person Search (UC-TBPS) must contend not only with modality heterogeneity but also with the cross-domain difficulty that arises in more practical surveillance scenarios. However, little research has focused on the cross-domain difficulty, which may severely hinder real-world applications of TBPS. In this paper, we propose the Test-time Cache-aided Cross-modal Correlation Correction (TC4) method, a pioneering approach that specifically addresses the UC-TBPS task through test-time re-ranking. First, we cluster the pedestrian image gallery and construct reward and penalty caches based on the cluster centers; these caches store additional sentence relays to alleviate the cross-domain problem. Second, we calculate reward and penalty values under the guidance of these two caches, respectively, to refine the image-sentence correlations at appropriately located positions. Finally, the refined image-sentence correlations are used to re-rank the original retrieval results. As a test-time re-ranking approach, TC4 does not require fine-tuning in the target domain and improves retrieval performance with negligible additional overhead. Extensive experiments and analyses on UC-TBPS and on unsupervised cross-domain image-text matching validate the effectiveness and generalization capacity of the proposed TC4 solution.
KW - Cross-domain adaptation
KW - Cross-modal retrieval
KW - Person search
KW - Re-ranking
UR - https://www.scopus.com/pages/publications/105017849694
U2 - 10.1016/j.patcog.2025.112521
DO - 10.1016/j.patcog.2025.112521
M3 - Article
AN - SCOPUS:105017849694
SN - 0031-3203
VL - 172
JO - Pattern Recognition
JF - Pattern Recognition
M1 - 112521
ER -