TY - JOUR
T1 - Ground truth is the best teacher
T2 - supervised semantic segmentation inspired by knowledge transfer mechanisms
AU - Yu, Xiangchun
AU - Liu, Huofa
AU - Zhang, Dingwen
AU - Liang, Miaomiao
AU - Yu, Lingjuan
AU - Zheng, Jian
N1 - Publisher Copyright:
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.
PY - 2025/2
Y1 - 2025/2
N2 - Knowledge distillation typically requires additional distillation costs to improve model performance. In this paper, our focus lies in the straightforward construction of task-level losses by mimicking the knowledge transfer mechanism embedded in the existing logits-based knowledge distillation. Firstly, we put forward a method that enables direct knowledge transfer from the ground truth, with the aim of eliminating the supplementary costs linked to traditional distillation methods. Furthermore, we introduce a strategy to address the issue of overconfident softmax predictions that may emerge from this direct transfer. By applying a linear mapping to the ground truth, we can effectively regulate the model’s outputs and thus enhance the reliability of predictions. We carry out extensive experiments on the Cityscapes dataset, the Pascal Context dataset, ADE20K, and COCO Stuff164k. Both the experimental and visualization results illustrate that our proposed methods surpass the state-of-the-art KD methods in terms of training efficiency and segmentation performance.
AB - Knowledge distillation typically requires additional distillation costs to improve model performance. In this paper, our focus lies in the straightforward construction of task-level losses by mimicking the knowledge transfer mechanism embedded in the existing logits-based knowledge distillation. Firstly, we put forward a method that enables direct knowledge transfer from the ground truth, with the aim of eliminating the supplementary costs linked to traditional distillation methods. Furthermore, we introduce a strategy to address the issue of overconfident softmax predictions that may emerge from this direct transfer. By applying a linear mapping to the ground truth, we can effectively regulate the model’s outputs and thus enhance the reliability of predictions. We carry out extensive experiments on the Cityscapes dataset, the Pascal Context dataset, ADE20K, and COCO Stuff164k. Both the experimental and visualization results illustrate that our proposed methods surpass the state-of-the-art KD methods in terms of training efficiency and segmentation performance.
KW - Knowledge distillation
KW - Knowledge transfer mechanism
KW - Non-rigid knowledge transfer
KW - Semantic segmentation
KW - Softmax overconfidence
UR - https://www.scopus.com/pages/publications/85214107043
U2 - 10.1007/s00530-024-01620-5
DO - 10.1007/s00530-024-01620-5
M3 - Article
AN - SCOPUS:85214107043
SN - 0942-4962
VL - 31
JO - Multimedia Systems
JF - Multimedia Systems
IS - 1
M1 - 41
ER -