HPL-ESS: Hybrid Pseudo-Labeling for Unsupervised Event-based Semantic Segmentation

Linglin Jing; Yiming Ding; Yunpeng Gao; Zhigang Wang; Xu Yan; Dong Wang; Gerald Schaefer; Hui Fang; Bin Zhao; Xuelong Li

doi:10.1109/CVPR52733.2024.02182

School of Artificial Intelligence, OPtics and Electronics

Research output: Contribution to journal › Conference article › peer-review

5 Scopus citations

Abstract

Event-based semantic segmentation has gained popularity due to its capability to deal with scenarios under high-speed motion and extreme lighting conditions, which cannot be addressed by conventional RGB cameras. Since it is hard to annotate event data, previous approaches rely on event-to-image reconstruction to obtain pseudo labels for training. However, this will inevitably introduce noise, and learning from noisy pseudo labels, especially when generated from a single source, may reinforce the errors. This drawback is also called confirmation bias in pseudo-labeling. In this paper, we propose a novel hybrid pseudo-labeling framework for unsupervised event-based semantic segmentation, HPL-ESS, to alleviate the influence of noisy pseudo labels. Specifically, we first employ a plain unsupervised domain adaptation framework as our baseline, which can generate a set of pseudo labels through self-training. Then, we incorporate offline event-to-image reconstruction into the framework, and obtain another set of pseudo labels by predicting segmentation maps on the reconstructed images. A noisy label learning strategy is designed to mix the two sets of pseudo labels and enhance the quality. Moreover, we propose a soft prototypical alignment (SPA) module to further improve the consistency of target domain features. Extensive experiments show that the proposed method outperforms existing state-of-the-art methods by a large margin on benchmarks (e.g., +5.88% accuracy, +10.32% mIoU on the DSEC-Semantic dataset), and even surpasses several supervised methods.
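To make the idea of combining two pseudo-label sources concrete, here is a minimal Python sketch. It is not the paper's noisy label learning strategy, which the abstract does not specify. The fusion rule (keep agreeing pixels, otherwise keep the more confident source only above a threshold), the function name `fuse_pseudo_labels`, the `IGNORE` index of 255, and the threshold of 0.9 are all assumptions chosen purely for illustration.

```python
from typing import List, Tuple

# Hypothetical "ignore" index for pixels that receive no pseudo label.
IGNORE = 255

# One pixel prediction: (class id, confidence in [0, 1]).
Pixel = Tuple[int, float]


def fuse_pseudo_labels(
    self_training: List[List[Pixel]],
    reconstruction: List[List[Pixel]],
    threshold: float = 0.9,
) -> List[List[int]]:
    """Fuse two pseudo-label maps pixel by pixel (illustrative rule only).

    - If both sources predict the same class, keep that class.
    - If they disagree, take the more confident source, but only when its
      confidence reaches `threshold`; otherwise mark the pixel as IGNORE.
    """
    fused = []
    for row_a, row_b in zip(self_training, reconstruction):
        fused_row = []
        for (cls_a, conf_a), (cls_b, conf_b) in zip(row_a, row_b):
            if cls_a == cls_b:
                fused_row.append(cls_a)
            else:
                cls, conf = (cls_a, conf_a) if conf_a >= conf_b else (cls_b, conf_b)
                fused_row.append(cls if conf >= threshold else IGNORE)
        fused.append(fused_row)
    return fused


# Toy 2x2 maps: one from self-training, one from reconstructed images.
labels_a = [[(1, 0.6), (2, 0.95)], [(0, 0.5), (3, 0.7)]]
labels_b = [[(1, 0.8), (4, 0.6)], [(2, 0.55), (3, 0.99)]]
print(fuse_pseudo_labels(labels_a, labels_b))  # [[1, 2], [255, 3]]
```

In this toy rule, disagreeing low-confidence pixels are excluded from training rather than forced to a class, which is one simple way to limit the confirmation bias the abstract describes.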

Original language	English
Pages (from-to)	23128-23137
Number of pages	10
Journal	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
DOIs	https://doi.org/10.1109/CVPR52733.2024.02182
State	Published - 2024
Event	2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024 - Seattle, United States
Duration	16 Jun 2024 → 22 Jun 2024

Keywords

Event Camera
Segmentation
Unsupervised Representation Learning

Cite this

@article{6fad8f1cb8344d84a719564fda7ff952,

title = "HPL-ESS: Hybrid Pseudo-Labeling for Unsupervised Event-based Semantic Segmentation",

abstract = "Event-based semantic segmentation has gained popularity due to its capability to deal with scenarios under high-speed motion and extreme lighting conditions, which cannot be addressed by conventional RGB cameras. Since it is hard to annotate event data, previous approaches rely on event-to-image reconstruction to obtain pseudo labels for training. However, this will inevitably introduce noise, and learning from noisy pseudo labels, especially when generated from a single source, may reinforce the errors. This drawback is also called confirmation bias in pseudo-labeling. In this paper, we propose a novel hybrid pseudo-labeling framework for unsupervised event-based semantic segmentation, HPL-ESS, to alleviate the influence of noisy pseudo labels. Specifically, we first employ a plain unsupervised domain adaptation framework as our baseline, which can generate a set of pseudo labels through self-training. Then, we incorporate offline event-to-image reconstruction into the framework, and obtain another set of pseudo labels by predicting segmentation maps on the reconstructed images. A noisy label learning strategy is designed to mix the two sets of pseudo labels and enhance the quality. Moreover, we propose a soft prototypical alignment (SPA) module to further improve the consistency of target domain features. Extensive experiments show that the proposed method outperforms existing state-of-the-art methods by a large margin on benchmarks (e.g., +5.88\% accuracy, +10.32\% mIoU on the DSEC-Semantic dataset), and even surpasses several supervised methods.",

keywords = "Event Camera, Segmentation, Unsupervised Representation Learning",

author = "Linglin Jing and Yiming Ding and Yunpeng Gao and Zhigang Wang and Xu Yan and Dong Wang and Gerald Schaefer and Hui Fang and Bin Zhao and Xuelong Li",

note = "Publisher Copyright: {\textcopyright} 2024 IEEE.; 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024 ; Conference date: 16-06-2024 Through 22-06-2024",

year = "2024",

doi = "10.1109/CVPR52733.2024.02182",

language = "English",

pages = "23128--23137",

journal = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",

issn = "1063-6919",

publisher = "IEEE Computer Society",

}

TY - JOUR

T1 - HPL-ESS: Hybrid Pseudo-Labeling for Unsupervised Event-based Semantic Segmentation

T2 - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024

AU - Jing, Linglin

AU - Ding, Yiming

AU - Gao, Yunpeng

AU - Wang, Zhigang

AU - Yan, Xu

AU - Wang, Dong

AU - Schaefer, Gerald

AU - Fang, Hui

AU - Zhao, Bin

AU - Li, Xuelong

PY - 2024

Y1 - 2024

N2 - Event-based semantic segmentation has gained popularity due to its capability to deal with scenarios under high-speed motion and extreme lighting conditions, which cannot be addressed by conventional RGB cameras. Since it is hard to annotate event data, previous approaches rely on event-to-image reconstruction to obtain pseudo labels for training. However, this will inevitably introduce noise, and learning from noisy pseudo labels, especially when generated from a single source, may reinforce the errors. This drawback is also called confirmation bias in pseudo-labeling. In this paper, we propose a novel hybrid pseudo-labeling framework for unsupervised event-based semantic segmentation, HPL-ESS, to alleviate the influence of noisy pseudo labels. Specifically, we first employ a plain unsupervised domain adaptation framework as our baseline, which can generate a set of pseudo labels through self-training. Then, we incorporate offline event-to-image reconstruction into the framework, and obtain another set of pseudo labels by predicting segmentation maps on the reconstructed images. A noisy label learning strategy is designed to mix the two sets of pseudo labels and enhance the quality. Moreover, we propose a soft prototypical alignment (SPA) module to further improve the consistency of target domain features. Extensive experiments show that the proposed method outperforms existing state-of-the-art methods by a large margin on benchmarks (e.g., +5.88% accuracy, +10.32% mIoU on the DSEC-Semantic dataset), and even surpasses several supervised methods.

AB - Event-based semantic segmentation has gained popularity due to its capability to deal with scenarios under high-speed motion and extreme lighting conditions, which cannot be addressed by conventional RGB cameras. Since it is hard to annotate event data, previous approaches rely on event-to-image reconstruction to obtain pseudo labels for training. However, this will inevitably introduce noise, and learning from noisy pseudo labels, especially when generated from a single source, may reinforce the errors. This drawback is also called confirmation bias in pseudo-labeling. In this paper, we propose a novel hybrid pseudo-labeling framework for unsupervised event-based semantic segmentation, HPL-ESS, to alleviate the influence of noisy pseudo labels. Specifically, we first employ a plain unsupervised domain adaptation framework as our baseline, which can generate a set of pseudo labels through self-training. Then, we incorporate offline event-to-image reconstruction into the framework, and obtain another set of pseudo labels by predicting segmentation maps on the reconstructed images. A noisy label learning strategy is designed to mix the two sets of pseudo labels and enhance the quality. Moreover, we propose a soft prototypical alignment (SPA) module to further improve the consistency of target domain features. Extensive experiments show that the proposed method outperforms existing state-of-the-art methods by a large margin on benchmarks (e.g., +5.88% accuracy, +10.32% mIoU on the DSEC-Semantic dataset), and even surpasses several supervised methods.

KW - Event Camera

KW - Segmentation

KW - Unsupervised Representation Learning

UR - http://www.scopus.com/inward/record.url?scp=85205723479&partnerID=8YFLogxK

U2 - 10.1109/CVPR52733.2024.02182

DO - 10.1109/CVPR52733.2024.02182

M3 - Conference article

AN - SCOPUS:85205723479

SN - 1063-6919

SP - 23128

EP - 23137

JO - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

JF - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

Y2 - 16 June 2024 through 22 June 2024

ER -
