TY - JOUR
T1 - MBIAN: Multi-level bilateral interactive attention network for multi-modal image processing
AU - Sun, Kai
AU - Zhang, Jiangshe
AU - Wang, Jialin
AU - Xu, Shuang
AU - Zhang, Chunxia
AU - Hu, Junying
N1 - Publisher Copyright:
© 2023 Elsevier Ltd
PY - 2023/11/30
Y1 - 2023/11/30
N2 - Convolutional neural networks (CNNs) have achieved impressive success in the multi-modal image processing (MIP) area. However, many existing CNN approaches fuse the features of the target and guidance images only once, which may cause a loss of information. To alleviate this problem, we present a multi-level bilateral interactive attention network (MBIAN) to fuse the features of the target and guidance images by their progressive interaction at different levels. Concretely, for each level, a bilateral interactive attention block (BIAB) is proposed to fuse the information of target and guidance images and refine their features. As the core component of our BIAB, a novel bilateral interactive attention layer (BIAL) is designed, where target and guidance images can mutually determine the attention weights. In addition, in each BIAB, long and short local shortcuts are employed to further facilitate the flow of information. Numerical experiments are conducted for three different problems, including panchromatic guided multi-spectral image super-resolution, near-infrared guided RGB image denoising, and flash-guided no-flash image denoising. The results demonstrate the versatility and superiority of MBIAN in terms of quantitative metrics and visual inspection, against 14 popular and state-of-the-art methods.
KW - Bilateral interactive attention layer
KW - Long and short local shortcuts
KW - Multi-level bilateral interactive attention network
KW - Multi-modal image processing
UR - http://www.scopus.com/inward/record.url?scp=85162135205&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2023.120733
DO - 10.1016/j.eswa.2023.120733
M3 - Article
AN - SCOPUS:85162135205
SN - 0957-4174
VL - 231
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 120733
ER -