
Infrared and Visible Image Fusion Using Ternary Cycle-Consistent Adversarial Networks

  • Kaiyang Ge
  • Xue Wang
  • Shuaiteng Han
  • Guoqing Zhou
  • Qing Wang
  • Northwestern Polytechnical University Xian

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Infrared and visible image fusion (IVIF) integrates complementary information from distinct spectral bands to improve image quality and support scene-understanding tasks such as object detection. Existing methods generally assume that the same modalities are available during training and testing. To achieve accurate and robust object detection in real-world applications, modality-drop scenarios must also be considered. This paper proposes a novel framework that leverages triadic training data to establish dual bidirectional mappings between the source modalities and the fusion domain. The learned mappings comprise two core generative paths that synthesize fused images from infrared or visible inputs (infrared → fused, visible → fused) and two auxiliary reconstruction paths that enforce semantic consistency through inverse translations (infrared ← fused, visible ← fused). To address the under-constrained nature of these mappings across the infrared and visible modalities, we introduce, in addition to the adversarial loss: (i) a ternary cycle-consistency loss enforcing mutual coherence among the dual bidirectional mappings; and (ii) a hybrid supervision loss combining a fusion loss that ensures pixel-wise fidelity to the ground truth with a reconstruction loss that regularizes the auxiliary mappings. To evaluate the proposed method, we constructed a new dataset for IVIF and object detection, named DroneCar, collected from an unmanned aerial vehicle (UAV) platform. Experimental results on DroneCar and three public datasets demonstrate that the proposed method outperforms existing state-of-the-art approaches, in particular improving the downstream object detection accuracy of unimodal networks when compared to modality fusion methods across multiple IVIF datasets.
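
The loss structure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function names, the use of an L1 distance, the equal weighting of terms, and in particular the three-term reading of the "ternary" cycle-consistency loss (two round-trip terms plus an agreement term between the two fused estimates). The adversarial loss is omitted, and images are plain Python lists rather than tensors.

```python
from typing import Callable, List, Sequence

Image = Sequence[float]                  # a flattened toy "image"
Mapping = Callable[[Image], List[float]] # hypothetical generator signature


def l1(a: Image, b: Image) -> float:
    """Mean absolute error between two equally sized images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_loss(ir: Image, vis: Image,
               g_ir2f: Mapping, g_vis2f: Mapping,
               f_f2ir: Mapping, f_f2vis: Mapping) -> float:
    """Assumed three-term cycle loss (not taken from the paper)."""
    fused_from_ir = g_ir2f(ir)     # infrared -> fused
    fused_from_vis = g_vis2f(vis)  # visible -> fused
    return (
        l1(f_f2ir(fused_from_ir), ir)        # infrared round trip
        + l1(f_f2vis(fused_from_vis), vis)   # visible round trip
        + l1(fused_from_ir, fused_from_vis)  # the two fused estimates agree
    )


def hybrid_supervision_loss(ir: Image, vis: Image, fused_gt: Image,
                            g_ir2f: Mapping, g_vis2f: Mapping,
                            f_f2ir: Mapping, f_f2vis: Mapping) -> float:
    """Fusion loss against ground truth plus reconstruction loss (assumed L1)."""
    fusion = l1(g_ir2f(ir), fused_gt) + l1(g_vis2f(vis), fused_gt)
    recon = l1(f_f2ir(fused_gt), ir) + l1(f_f2vis(fused_gt), vis)
    return fusion + recon


def identity(img: Image) -> List[float]:
    """Stand-in for a trained generator; simply copies its input."""
    return list(img)


# Toy triad: infrared, visible, and a ground-truth fused image.
ir, vis, fused_gt = [0.0, 1.0], [1.0, 0.0], [0.5, 0.5]
maps = (identity, identity, identity, identity)
print(cycle_loss(ir, vis, *maps))
print(hybrid_supervision_loss(ir, vis, fused_gt, *maps))
```

With identity stand-ins the round-trip terms vanish, so the cycle loss reduces to the disagreement between the two fused estimates. In a real setup the four mappings would be trainable networks and these terms would be weighted and summed with the adversarial loss.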

Original language: English
Title of host publication: Proceedings - 2025 International Conference on Virtual Reality and Visualization, ICVRV 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 697-702
Number of pages: 6
ISBN (Electronic): 9798331556297
State: Published - 2025
Event: 2025 International Conference on Virtual Reality and Visualization, ICVRV 2025 - Bogota, Colombia
Duration: 19 Dec 2025 to 21 Dec 2025

Publication series

Name: Proceedings - 2025 International Conference on Virtual Reality and Visualization, ICVRV 2025

Conference

Conference: 2025 International Conference on Virtual Reality and Visualization, ICVRV 2025
Country/Territory: Colombia
City: Bogota
Period: 19/12/25 to 21/12/25

Keywords

  • Generative Adversarial Network
  • Image Fusion
  • Modality Drop
  • Multimodal Learning
  • Object Detection
