Abstract
Visible-infrared person re-identification (VI-ReID) is a challenging task in computer vision that aims to match individuals across images captured in visible and infrared modalities. Existing approaches typically focus on either image-level or feature-level alignment, yet often struggle to effectively bridge the modality gap. In this paper, we propose a novel frequency-aware representation learning framework that leverages the complementary properties of visible and infrared images in the frequency domain to generate diverse and informative embeddings, thereby reducing cross-modal discrepancies. Specifically, we first extract low- and high-frequency features from input representations, guided by adaptively decoupled spectral components. These features are then refined via a bidirectional modulation operator that promotes interaction between frequency components. Furthermore, we design a multistage knowledge fusion module to enhance the complementarity between global structures and fine-grained details across multiple frequency scales. Extensive experiments on public benchmark datasets demonstrate that our method significantly outperforms state-of-the-art approaches, validating its effectiveness and generalization capability in complex cross-modal scenarios.
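The low-/high-frequency decoupling mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract describes adaptively decoupled spectral components, whereas the sketch below assumes a fixed radial cutoff in the 2D Fourier domain. The function name `split_frequency_bands` and the parameter `cutoff_ratio` are hypothetical and chosen only for illustration.

```python
import numpy as np


def split_frequency_bands(feature_map, cutoff_ratio=0.25):
    """Split a 2D map into low- and high-frequency parts.

    Assumption: a fixed circular mask replaces the adaptive
    decoupling described in the abstract.
    """
    h, w = feature_map.shape
    # Centre the spectrum so the zero-frequency bin sits at (h // 2, w // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(feature_map))

    # Distance of every bin from the zero-frequency bin.
    yy, xx = np.mgrid[:h, :w]
    radius = np.hypot(yy - h // 2, xx - w // 2)

    # Bins inside the circle count as low frequency, the rest as high.
    low_mask = radius <= cutoff_ratio * min(h, w) / 2

    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high
```

Because the two masks are complementary, the low- and high-frequency parts sum back to the input, so the split loses no information. A learned or input-dependent mask would be one way to approximate the adaptive behaviour the abstract refers to.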
| Field | Value |
|---|---|
| Original language | English |
| Article number | 105526 |
| Journal | Digital Signal Processing: A Review Journal |
| Volume | 168 |
| State | Published - Jan 2026 |
Keywords
- Frequency mining
- Multistage knowledge fusion
- Visible-infrared person re-identification
Article title: Visible-infrared person re-identification via adaptive frequency mining and embedding