Specificity-preserving RGB-D Saliency Detection

Tao Zhou; Huazhu Fu; Geng Chen; Yi Zhou; Deng Ping Fan; Ling Shao

doi:10.1109/ICCV48922.2021.00464

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

184 Scopus citations

Abstract

RGB-D saliency detection has attracted increasing attention, due to its effectiveness and the fact that depth cues can now be conveniently captured. Existing works often focus on learning a shared representation through various fusion strategies, with few methods explicitly considering how to preserve modality-specific characteristics. In this paper, taking a new perspective, we propose a specificity-preserving network (SP-Net) for RGB-D saliency detection, which benefits saliency detection performance by exploring both the shared information and modality-specific properties (e.g., specificity). Specifically, two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps. A cross-enhanced integration module (CIM) is proposed to fuse cross-modal features in the shared learning network, which are then propagated to the next layer for integrating cross-level information. Besides, we propose a multi-modal feature aggregation (MFA) module to integrate the modality-specific features from each individual decoder into the shared decoder, which can provide rich complementary multi-modal information to boost the saliency detection performance. Further, a skip connection is used to combine hierarchical features between the encoder and decoder layers. Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods. Code is available at: https://github.com/taozh2017/SPNet.

Original language	English
Title of host publication	Proceedings - 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	4661-4671
Number of pages	11
ISBN (Electronic)	9781665428125
DOIs	https://doi.org/10.1109/ICCV48922.2021.00464
State	Published - 2021
Externally published	Yes
Event	18th IEEE/CVF International Conference on Computer Vision, ICCV 2021 - Virtual, Online, Canada
Duration	11 Oct 2021 → 17 Oct 2021

Publication series

Name	Proceedings of the IEEE International Conference on Computer Vision
ISSN (Print)	1550-5499

Conference

Conference	18th IEEE/CVF International Conference on Computer Vision, ICCV 2021
Country/Territory	Canada
City	Virtual, Online
Period	11/10/21 → 17/10/21

Access to Document

10.1109/ICCV48922.2021.00464

Cite this

Zhou, T., Fu, H., Chen, G., Zhou, Y., Fan, D. P., & Shao, L. (2021). Specificity-preserving RGB-D Saliency Detection. In Proceedings - 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021 (pp. 4661-4671). (Proceedings of the IEEE International Conference on Computer Vision). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCV48922.2021.00464

@inproceedings{a1a193a8f0ec46a18a4f942bfb2fcdcc,

title = "Specificity-preserving RGB-D Saliency Detection",

abstract = "RGB-D saliency detection has attracted increasing attention, due to its effectiveness and the fact that depth cues can now be conveniently captured. Existing works often focus on learning a shared representation through various fusion strategies, with few methods explicitly considering how to preserve modality-specific characteristics. In this paper, taking a new perspective, we propose a specificity-preserving network (SP-Net) for RGB-D saliency detection, which benefits saliency detection performance by exploring both the shared information and modality-specific properties (e.g., specificity). Specifically, two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps. A cross-enhanced integration module (CIM) is proposed to fuse cross-modal features in the shared learning network, which are then propagated to the next layer for integrating cross-level information. Besides, we propose a multi-modal feature aggregation (MFA) module to integrate the modality-specific features from each individual decoder into the shared decoder, which can provide rich complementary multi-modal information to boost the saliency detection performance. Further, a skip connection is used to combine hierarchical features between the encoder and decoder layers. Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods. Code is available at: https://github.com/taozh2017/SPNet.",

author = "Tao Zhou and Huazhu Fu and Geng Chen and Yi Zhou and Fan, {Deng Ping} and Ling Shao",

note = "Publisher Copyright: {\textcopyright} 2021 IEEE; 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021 ; Conference date: 11-10-2021 Through 17-10-2021",

year = "2021",

doi = "10.1109/ICCV48922.2021.00464",

language = "English",

series = "Proceedings of the IEEE International Conference on Computer Vision",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "4661--4671",

booktitle = "Proceedings - 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021",

}

Zhou, T, Fu, H, Chen, G, Zhou, Y, Fan, DP & Shao, L 2021, Specificity-preserving RGB-D Saliency Detection. in Proceedings - 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021. Proceedings of the IEEE International Conference on Computer Vision, Institute of Electrical and Electronics Engineers Inc., pp. 4661-4671, 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021, Virtual, Online, Canada, 11/10/21. https://doi.org/10.1109/ICCV48922.2021.00464

Specificity-preserving RGB-D Saliency Detection. / Zhou, Tao; Fu, Huazhu; Chen, Geng et al.
Proceedings - 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021. Institute of Electrical and Electronics Engineers Inc., 2021. p. 4661-4671 (Proceedings of the IEEE International Conference on Computer Vision).

TY - GEN

T1 - Specificity-preserving RGB-D Saliency Detection

AU - Zhou, Tao

AU - Fu, Huazhu

AU - Chen, Geng

AU - Zhou, Yi

AU - Fan, Deng Ping

AU - Shao, Ling

PY - 2021

Y1 - 2021

N2 - RGB-D saliency detection has attracted increasing attention, due to its effectiveness and the fact that depth cues can now be conveniently captured. Existing works often focus on learning a shared representation through various fusion strategies, with few methods explicitly considering how to preserve modality-specific characteristics. In this paper, taking a new perspective, we propose a specificity-preserving network (SP-Net) for RGB-D saliency detection, which benefits saliency detection performance by exploring both the shared information and modality-specific properties (e.g., specificity). Specifically, two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps. A cross-enhanced integration module (CIM) is proposed to fuse cross-modal features in the shared learning network, which are then propagated to the next layer for integrating cross-level information. Besides, we propose a multi-modal feature aggregation (MFA) module to integrate the modality-specific features from each individual decoder into the shared decoder, which can provide rich complementary multi-modal information to boost the saliency detection performance. Further, a skip connection is used to combine hierarchical features between the encoder and decoder layers. Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods. Code is available at: https://github.com/taozh2017/SPNet.

AB - RGB-D saliency detection has attracted increasing attention, due to its effectiveness and the fact that depth cues can now be conveniently captured. Existing works often focus on learning a shared representation through various fusion strategies, with few methods explicitly considering how to preserve modality-specific characteristics. In this paper, taking a new perspective, we propose a specificity-preserving network (SP-Net) for RGB-D saliency detection, which benefits saliency detection performance by exploring both the shared information and modality-specific properties (e.g., specificity). Specifically, two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps. A cross-enhanced integration module (CIM) is proposed to fuse cross-modal features in the shared learning network, which are then propagated to the next layer for integrating cross-level information. Besides, we propose a multi-modal feature aggregation (MFA) module to integrate the modality-specific features from each individual decoder into the shared decoder, which can provide rich complementary multi-modal information to boost the saliency detection performance. Further, a skip connection is used to combine hierarchical features between the encoder and decoder layers. Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods. Code is available at: https://github.com/taozh2017/SPNet.

UR - http://www.scopus.com/inward/record.url?scp=85127819115&partnerID=8YFLogxK

U2 - 10.1109/ICCV48922.2021.00464

DO - 10.1109/ICCV48922.2021.00464

M3 - Conference contribution

AN - SCOPUS:85127819115

T3 - Proceedings of the IEEE International Conference on Computer Vision

SP - 4661

EP - 4671

BT - Proceedings - 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021

Y2 - 11 October 2021 through 17 October 2021

ER -
