TY - GEN
T1 - Lightweight Multimodal Defect Detection at the Edge via Cross-Modal Distillation
AU - Wang, Baiqing
AU - Xing, Tao
AU - Liu, Xiaoning
AU - Peng, Zhe
AU - Cui, Helei
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - The learning capabilities of single-modality images are often severely limited and fail to meet the requirements of complex defect detection in industrial settings. For instance, traditional visible light images are susceptible to environmental factors such as lighting and occlusions, while infrared images cannot capture texture details due to their low spatial resolution. Consequently, employing multiple image modalities typically yields better results than relying on a single modality. However, utilizing data from multiple modalities inevitably introduces additional computational costs and places high hardware demands on edge computing devices, while real-time detection remains critical in industrial environments. To address these challenges, we propose a multimodal distillation approach that uses visible and infrared images as inputs to train a complex teacher model, while the student model continues to operate with a single-modal image input. Through knowledge transfer, the student model is enhanced, and model light-weighting is implemented to ensure that it can acquire multi-modal feature information while still meeting real-time performance requirements.
AB - The learning capabilities of single-modality images are often severely limited and fail to meet the requirements of complex defect detection in industrial settings. For instance, traditional visible light images are susceptible to environmental factors such as lighting and occlusions, while infrared images cannot capture texture details due to their low spatial resolution. Consequently, employing multiple image modalities typically yields better results than relying on a single modality. However, utilizing data from multiple modalities inevitably introduces additional computational costs and places high hardware demands on edge computing devices, while real-time detection remains critical in industrial environments. To address these challenges, we propose a multimodal distillation approach that uses visible and infrared images as inputs to train a complex teacher model, while the student model continues to operate with a single-modal image input. Through knowledge transfer, the student model is enhanced, and model light-weighting is implemented to ensure that it can acquire multi-modal feature information while still meeting real-time performance requirements.
KW - Defect detection
KW - edge intelligence
KW - model lightweight
KW - multi-modal
UR - http://www.scopus.com/inward/record.url?scp=85206385354&partnerID=8YFLogxK
U2 - 10.1109/IWQoS61813.2024.10682904
DO - 10.1109/IWQoS61813.2024.10682904
M3 - Conference contribution
AN - SCOPUS:85206385354
T3 - IEEE International Workshop on Quality of Service, IWQoS
BT - 2024 IEEE/ACM 32nd International Symposium on Quality of Service, IWQoS 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 32nd IEEE/ACM International Symposium on Quality of Service, IWQoS 2024
Y2 - 19 June 2024 through 21 June 2024
ER -