TY - JOUR
T1 - Less Is More
T2 - Infrared and Visible Images Fusion via Semantic-Guided Mixture of Multi-Feature Experts
AU - Xing, Yinghui
AU - Niu, Zhilong
AU - Yang, Shuo
AU - Zhang, Shizhou
AU - Zhang, Yanning
N1 - Publisher Copyright: © 2026 IEEE.
PY - 2026
Y1 - 2026
AB - Infrared (IR) and visible image fusion (IVIF) has become prevalent in recent years. By leveraging the complementary characteristics of infrared and visible images, fusion yields visually appealing images that facilitate subsequent scene understanding and object detection from day to night. A crucial challenge in fusion is integrating complementary information while simultaneously eliminating redundancy. Most available deep-learning-based methods, once trained, perform static inference on all pairs of infrared and visible images. They struggle to handle modality redundancy across diverse scenarios, resulting in superfluous information such as thermal noise from infrared images and artifacts from visible images. In this paper, we propose an IVIF method based on a semantic-guided mixture of multi-feature experts, in which multiple types of features are extracted and each type is assigned to a dedicated expert network specialized in processing it. An expert routing mechanism selects these experts dynamically, ensuring that the most significant features of each image modality are routed to a specific group of experts. To align the fusion task with the subsequent semantic segmentation task, we introduce a segmentation head that semantically guides the selection of complementary features. Extensive experiments on five infrared and visible image fusion and segmentation benchmarks demonstrate the effectiveness of our method for both image fusion and subsequent semantic segmentation. The code will be available at https://github.com/ZhilongNiu/SD-MoMFE
KW - Image fusion
KW - infrared images
KW - mixture-of-experts
UR - https://www.scopus.com/pages/publications/105034427705
U2 - 10.1109/TIP.2026.3675500
DO - 10.1109/TIP.2026.3675500
M3 - Article
AN - SCOPUS:105034427705
SN - 1057-7149
VL - 35
SP - 3381
EP - 3394
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
ER -