TY - JOUR
T1 - Learning depth via leveraging semantics
T2 - Self-supervised monocular depth estimation with both implicit and explicit semantic guidance
AU - Li, Rui
AU - Xue, Danna
AU - Su, Shaolin
AU - He, Xiantuo
AU - Mao, Qing
AU - Zhu, Yu
AU - Sun, Jinqiu
AU - Zhang, Yanning
N1 - Publisher Copyright:
© 2023 Elsevier Ltd
PY - 2023/5
Y1 - 2023/5
N2 - Self-supervised monocular depth estimation has shown great success in learning depth using only images for supervision. In this paper, we enhance self-supervised depth estimation with semantics and propose a novel learning scheme that incorporates both implicit and explicit semantic guidance. Specifically, we relate depth distributions to semantic category information with a Semantic-aware Spatial Feature Modulation (SSFM) scheme, which implicitly modulates semantic and depth features in a joint learning framework; the modulation parameters are generated from semantic labels to provide category-level guidance. Meanwhile, a semantic-guided ranking loss is proposed to explicitly constrain the estimated depth borders using the corresponding segmentation labels. To mitigate the impact of erroneous segmentation labels, both a robust sampling strategy and prediction uncertainty weighting are introduced for the ranking loss. Extensive experimental results show that our method produces high-quality depth maps with semantically consistent depth distributions and accurate depth edges, outperforming state-of-the-art methods by significant margins.
AB - Self-supervised monocular depth estimation has shown great success in learning depth using only images for supervision. In this paper, we enhance self-supervised depth estimation with semantics and propose a novel learning scheme that incorporates both implicit and explicit semantic guidance. Specifically, we relate depth distributions to semantic category information with a Semantic-aware Spatial Feature Modulation (SSFM) scheme, which implicitly modulates semantic and depth features in a joint learning framework; the modulation parameters are generated from semantic labels to provide category-level guidance. Meanwhile, a semantic-guided ranking loss is proposed to explicitly constrain the estimated depth borders using the corresponding segmentation labels. To mitigate the impact of erroneous segmentation labels, both a robust sampling strategy and prediction uncertainty weighting are introduced for the ranking loss. Extensive experimental results show that our method produces high-quality depth maps with semantically consistent depth distributions and accurate depth edges, outperforming state-of-the-art methods by significant margins.
KW - Robust point pair sampling
KW - Semantic-aware spatial feature modulation
KW - Semantic-guided ranking loss
KW - Semantic-guided self-supervised depth estimation
KW - Uncertainty weighting
UR - http://www.scopus.com/inward/record.url?scp=85146583067&partnerID=8YFLogxK
U2 - 10.1016/j.patcog.2022.109297
DO - 10.1016/j.patcog.2022.109297
M3 - Article
AN - SCOPUS:85146583067
SN - 0031-3203
VL - 137
JO - Pattern Recognition
JF - Pattern Recognition
M1 - 109297
ER -