TY - JOUR
T1 - ‘Parallel-Circuitized’ distillation for dense object detection
AU - Song, Yaoye
AU - Zhang, Peng
AU - Huang, Wei
AU - Zha, Yufei
AU - You, Tao
AU - Zhang, Yanning
N1 - Publisher Copyright:
© 2023
PY - 2024/1
Y1 - 2024/1
N2 - As an effective model compression strategy, knowledge distillation allows a lightweight student model to acquire knowledge from a more expressive large-scale teacher model. Unfortunately, although feature-imitation-based distillation for object detection is typically designed to address the imbalance between positive and negative samples, recent dense object detectors already handle this issue well. Thus, combining the two yields diminishing returns, meaning that such knowledge distillation brings little benefit to dense object detection. Recent research has shown that response-based knowledge distillation schemes can overcome this limitation by directly mimicking the predictions of the teacher model, but the shortcomings of existing attempts still limit further progress in overall performance. Inspired by an analogy with the principle of parallel circuits for enhancing the effect of dual-stream structured networks, this work proposes a parallel knowledge distillation framework for dense object detection. Meanwhile, to further enable more reliable Localization Quality Estimation (LQE) for detection, a Soft Distribute-Guided Quality Predictor (SDGQP) is introduced for dynamic selection of distribution statistics. Additionally, with localization quality distillation, the gap between the classification and bounding box regression branches can be bridged based on the more reliable localization quality score from SDGQP. Experiments on different benchmark datasets show that the proposed work outperforms other state-of-the-art dense object detectors in both accuracy and robustness.
AB - As an effective model compression strategy, knowledge distillation allows a lightweight student model to acquire knowledge from a more expressive large-scale teacher model. Unfortunately, although feature-imitation-based distillation for object detection is typically designed to address the imbalance between positive and negative samples, recent dense object detectors already handle this issue well. Thus, combining the two yields diminishing returns, meaning that such knowledge distillation brings little benefit to dense object detection. Recent research has shown that response-based knowledge distillation schemes can overcome this limitation by directly mimicking the predictions of the teacher model, but the shortcomings of existing attempts still limit further progress in overall performance. Inspired by an analogy with the principle of parallel circuits for enhancing the effect of dual-stream structured networks, this work proposes a parallel knowledge distillation framework for dense object detection. Meanwhile, to further enable more reliable Localization Quality Estimation (LQE) for detection, a Soft Distribute-Guided Quality Predictor (SDGQP) is introduced for dynamic selection of distribution statistics. Additionally, with localization quality distillation, the gap between the classification and bounding box regression branches can be bridged based on the more reliable localization quality score from SDGQP. Experiments on different benchmark datasets show that the proposed work outperforms other state-of-the-art dense object detectors in both accuracy and robustness.
KW - Dense object detection
KW - Knowledge distillation
KW - Parallel circuit
UR - http://www.scopus.com/inward/record.url?scp=85179134097&partnerID=8YFLogxK
U2 - 10.1016/j.displa.2023.102587
DO - 10.1016/j.displa.2023.102587
M3 - Literature review
AN - SCOPUS:85179134097
SN - 0141-9382
VL - 81
JO - Displays
JF - Displays
M1 - 102587
ER -