TY - GEN
T1 - RGB-D Saliency Detection via Cascaded Mutual Information Minimization
AU - Zhang, Jing
AU - Fan, Deng-Ping
AU - Dai, Yuchao
AU - Yu, Xin
AU - Zhong, Yiran
AU - Barnes, Nick
AU - Shao, Ling
N1 - Publisher Copyright:
© 2021 IEEE
PY - 2021
Y1 - 2021
N2 - Existing RGB-D saliency detection models do not explicitly encourage RGB and depth to achieve effective multi-modal learning. In this paper, we introduce a novel multi-stage cascaded learning framework via mutual information minimization to explicitly model the multi-modal information between an RGB image and its depth data. Specifically, we first map the features of each modality to a lower-dimensional feature vector and adopt mutual information minimization as a regularizer to reduce the redundancy between appearance features from RGB and geometric features from depth. We then perform multi-stage cascaded learning to impose the mutual information minimization constraint at every stage of the network. Extensive experiments on benchmark RGB-D saliency datasets demonstrate the effectiveness of our framework. Further, to foster the development of this field, we contribute the COME15K dataset, the largest of its kind (7× larger than NJU2K), which contains 15,625 image pairs with high-quality polygon-/scribble-/object-/instance-/rank-level annotations. Based on these rich labels, we additionally construct four new benchmarks with strong baselines and observe some interesting phenomena, which can motivate future model design. Source code and dataset are available at https://github.com/JingZhang617/cascaded_rgbd_sod.
UR - http://www.scopus.com/inward/record.url?scp=85117233669&partnerID=8YFLogxK
U2 - 10.1109/ICCV48922.2021.00430
DO - 10.1109/ICCV48922.2021.00430
M3 - Conference contribution
AN - SCOPUS:85117233669
T3 - Proceedings of the IEEE International Conference on Computer Vision
SP - 4318
EP - 4327
BT - Proceedings - 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021
Y2 - 11 October 2021 through 17 October 2021
ER -