TY - JOUR
T1 - Adjustable Visible and Infrared Image Fusion
AU - Wu, Boxiong
AU - Nie, Jiangtao
AU - Wei, Wei
AU - Zhang, Lei
AU - Zhang, Yanning
N1 - Publisher Copyright: © 1991-2012 IEEE.
PY - 2024
Y1 - 2024
AB - Visible and infrared image fusion (VIF) aims to exploit the complementary information in these two modalities to synthesize a new image containing richer information. Although VIF has been extensively studied, it is difficult to reach consensus on which synthesized image has the best visual quality, since users have different opinions. To address this problem, we propose an adjustable VIF framework termed AdjFusion, which introduces a global controlling coefficient into VIF so that it can interact with users. Within AdjFusion, a semantic-aware modulation module transforms the global controlling coefficient into a semantic-aware controlling coefficient, which provides pixel-wise guidance for AdjFusion that accounts for both interactivity and the semantic information within the visible and infrared images. In addition, the global controlling coefficient not only serves as an external interface for interaction with users but can also be easily customized by downstream tasks (e.g., VIF-based detection and segmentation), which helps to select the best fusion result for those tasks. Taking advantage of this, we further propose a lightweight adaptation module for AdjFusion that learns a global controlling coefficient better suited to the downstream tasks. Experimental results demonstrate that the proposed AdjFusion can 1) provide ways to dynamically synthesize images to meet the diverse demands of users; and 2) outperform the previous state-of-the-art methods on both VIF-based detection and segmentation tasks when combined with the lightweight adaptation module. Our code will be released upon acceptance at https://github.com/BearTo2/AdjFusion.
KW - adjustable
KW - attention mechanism
KW - dynamically synthesize
KW - Infrared and visible image fusion
KW - lightweight adaptation module
UR - http://www.scopus.com/inward/record.url?scp=85202749177&partnerID=8YFLogxK
U2 - 10.1109/TCSVT.2024.3449638
DO - 10.1109/TCSVT.2024.3449638
M3 - Article
AN - SCOPUS:85202749177
SN - 1051-8215
VL - 34
SP - 13463
EP - 13477
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 12
ER -