AMDFNet: Adaptive multi-level deformable fusion network for RGB-D saliency detection

Fei Li, Jiangbin Zheng, Yuanfang Zhang, Nian Liu, Wenjing Jia

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)

Abstract

Effective exploration of useful contextual information in multi-modal images is an essential task in salient object detection. However, existing methods based on early-fusion or late-fusion schemes cannot address this problem, as they fail to effectively resolve the distribution gap and information loss between modalities. In this paper, we propose an adaptive multi-level deformable fusion network (AMDFNet) to exploit cross-modality information. We use a cross-modality deformable convolution module to dynamically adjust the boundaries of salient objects by exploring the extra input from the other modality. This enables the network to incorporate existing features and propagate more context, strengthening the model's ability to perceive scenes. To accurately refine the predicted maps, a multi-scaled feature refinement module is proposed to enhance the intermediate features with multi-level predictions in the decoder. Furthermore, we introduce a selective cross-modality attention module into the fusion process to exploit the attention mechanism. This module captures dense long-range cross-modality dependencies from the perspective of multi-modal hierarchical features. This strategy enables the network to select more informative details and suppress the contamination caused by low-quality depth maps. Experimental results on eight benchmark datasets demonstrate the effectiveness of the individual components of our proposed model as well as the overall saliency model.
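The selective cross-modality attention described above can be illustrated with a minimal sketch, assuming queries come from the RGB branch and keys/values from the depth branch; the function and variable names below are illustrative and not the paper's implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modality_attention(rgb_feat, depth_feat):
    """Toy cross-modality attention: every RGB position attends over all
    depth positions, capturing dense long-range cross-modality dependencies.
    Shapes are (N, C), with N = H*W flattened spatial positions."""
    q = rgb_feat                       # queries from the RGB branch
    k = depth_feat                     # keys from the depth branch
    v = depth_feat                     # values from the depth branch
    scale = np.sqrt(q.shape[-1])
    attn = softmax(q @ k.T / scale)    # (N, N) dense affinity matrix
    return rgb_feat + attn @ v         # residual fusion of attended depth cues

rgb = np.random.randn(64, 32)    # e.g. an 8x8 feature map with 32 channels
depth = np.random.randn(64, 32)
out = cross_modality_attention(rgb, depth)
print(out.shape)  # (64, 32)
```

The residual connection lets the attended depth information modulate, rather than replace, the RGB features, which is one simple way to suppress contamination when the depth input is unreliable.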

Original language: English
Pages (from-to): 141-156
Number of pages: 16
Journal: Neurocomputing
Volume: 465
DOI
Publication status: Published - 20 Nov 2021

