Potential region attention network for RGB-D salient object detection

Dawei Song, Yuan Yuan, Xuelong Li

科研成果: 期刊稿件文章同行评审

摘要

Many encouraging investigations have already been conducted on RGB-D salient object detection (SOD). However, most of these methods are limited in mining single-modal features and have not fully utilized the appropriate complementarity of cross-modal features. To alleviate the issues, this study designs a potential region attention network (PRANet) for RGB-D SOD. Specifically, the PRANet adopts Swin Transformer as its backbone to efficiently obtain two-stream features. Besides, a potential multi-scale attention module (PMAM) is equipped at the highest level of the encoder, which is beneficial for mining intra-modal information and enhancing feature expression. More importantly, a potential region attention module (PRAM) is designed to properly utilize the complementarity of cross-modal information, which adopts a potential region attention to guide two-stream feature fusion. In addition, by refining and correcting cross-layer features, a feature refinement fusion module (FRFM) is designed to strengthen the cross-layer information transmission between the encoder and decoder. Finally, the multi-side supervision is used during the training phase. Sufficient experimental results on 6 RGB-D SOD datasets indicate that our PRANet has achieved outstanding performance and is superior to 15 representative methods.

源语言英语
文章编号107620
期刊Neural Networks
190
DOI
出版状态已出版 - 10月 2025

指纹

探究 'Potential region attention network for RGB-D salient object detection' 的科研主题。它们共同构成独一无二的指纹。

引用此