Lightweight Multimodal Defect Detection at the Edge via Cross-Modal Distillation

Baiqing Wang; Tao Xing; Xiaoning Liu; Zhe Peng; Helei Cui

doi:10.1109/IWQoS61813.2024.10682904

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Learning from single-modality images is often severely limited and fails to meet the requirements of complex defect detection in industrial settings. For instance, traditional visible-light images are susceptible to environmental factors such as lighting and occlusion, while infrared images cannot capture texture details because of their low spatial resolution. Consequently, employing multiple image modalities typically yields better results than relying on a single modality. However, using data from multiple modalities inevitably introduces additional computational cost and places high hardware demands on edge computing devices, even though real-time detection is critical in industrial environments. To address these challenges, we propose a multimodal distillation approach that uses visible and infrared images as inputs to train a complex teacher model, while the student model continues to operate on single-modality image input. Knowledge transfer enhances the student model, and model lightweighting ensures that it acquires multimodal feature information while still meeting real-time performance requirements.
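
The teacher-to-student transfer described in the abstract can be illustrated with a generic soft-label distillation loss. This is only a minimal sketch under stated assumptions, not the authors' method: the abstract does not specify the loss, the architectures, or any hyperparameters. The function names, the temperature, the weight `alpha`, and the example logits are all hypothetical, and the sketch uses classification logits for brevity rather than a full detection head.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(teacher_logits, student_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and hard-label cross-entropy.

    Assumption: the teacher logits come from a model fed visible and
    infrared images, the student logits from a single-modality model.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # The T^2 factor keeps the soft-term gradient scale comparable across temperatures.
    soft = (temperature ** 2) * kl_divergence(p_teacher, p_student)
    hard = -math.log(softmax(student_logits)[label] + 1e-12)
    return alpha * soft + (1.0 - alpha) * hard


# Hypothetical logits for one sample with three defect classes.
teacher_logits = [4.0, 1.0, 0.5]   # teacher: visible + infrared input
student_logits = [2.0, 1.5, 0.5]   # student: single-modality input
loss = distillation_loss(teacher_logits, student_logits, label=0)
matched = distillation_loss(teacher_logits, teacher_logits, label=0, alpha=1.0)
```

In this sketch the soft term vanishes when the student reproduces the teacher's distribution, so minimizing it pulls the single-modality student toward predictions the teacher made with both modalities, while only the student has to run at inference time on the edge device.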

Original language	English
Title of host publication	2024 IEEE/ACM 32nd International Symposium on Quality of Service, IWQoS 2024
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9798350350128
DOIs	https://doi.org/10.1109/IWQoS61813.2024.10682904
State	Published - 2024
Event	32nd IEEE/ACM International Symposium on Quality of Service, IWQoS 2024 - Guangzhou, China; Duration: 19 Jun 2024 → 21 Jun 2024

Publication series

Name	IEEE International Workshop on Quality of Service, IWQoS
ISSN (Print)	1548-615X

Conference

Conference	32nd IEEE/ACM International Symposium on Quality of Service, IWQoS 2024
Country/Territory	China
City	Guangzhou
Period	19/06/24 → 21/06/24

Keywords

Defect detection
edge intelligence
model lightweight
multi-modal

Access to Document

10.1109/IWQoS61813.2024.10682904

Cite this

Wang, B., Xing, T., Liu, X., Peng, Z., & Cui, H. (2024). Lightweight Multimodal Defect Detection at the Edge via Cross-Modal Distillation. In 2024 IEEE/ACM 32nd International Symposium on Quality of Service, IWQoS 2024 (IEEE International Workshop on Quality of Service, IWQoS). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IWQoS61813.2024.10682904

@inproceedings{a32f9d9ee1434711979d402316b5f93a,

title = "Lightweight Multimodal Defect Detection at the Edge via Cross-Modal Distillation",

abstract = "The learning capabilities of single-modality images are often severely limited and fail to meet the requirements of complex defect detection in industrial settings. For instance, traditional visible light images are susceptible to environmental factors such as lighting and occlusions, while infrared images cannot capture texture details due to their low spatial resolution. Consequently, employing multiple image modalities typically yields better results than relying on a single modality. However, utilizing data from multiple modalities inevitably introduces additional computational costs, posing high hardware demands on edge computing devices, and the need for real-time detection in industrial environments is critical. To address these challenges, we propose a multimodal distillation approach that uses visible and infrared images as inputs to train a complex teacher model, while the student model continues to operate with a single-modal image input. Through knowledge transfer, the student model is enhanced, and model light-weighting is implemented to ensure that it can acquire multi-modal feature information while still meeting real-time performance requirements.",

keywords = "Defect detection, edge intelligence, model lightweight, multi-modal",

author = "Baiqing Wang and Tao Xing and Xiaoning Liu and Zhe Peng and Helei Cui",

note = "Publisher Copyright: {\textcopyright} 2024 IEEE.; 32nd IEEE/ACM International Symposium on Quality of Service, IWQoS 2024 ; Conference date: 19-06-2024 Through 21-06-2024",

year = "2024",

doi = "10.1109/IWQoS61813.2024.10682904",

language = "English",

series = "IEEE International Workshop on Quality of Service, IWQoS",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

booktitle = "2024 IEEE/ACM 32nd International Symposium on Quality of Service, IWQoS 2024",

}

Wang, B, Xing, T, Liu, X, Peng, Z & Cui, H 2024, Lightweight Multimodal Defect Detection at the Edge via Cross-Modal Distillation. in 2024 IEEE/ACM 32nd International Symposium on Quality of Service, IWQoS 2024. IEEE International Workshop on Quality of Service, IWQoS, Institute of Electrical and Electronics Engineers Inc., 32nd IEEE/ACM International Symposium on Quality of Service, IWQoS 2024, Guangzhou, China, 19/06/24. https://doi.org/10.1109/IWQoS61813.2024.10682904

Lightweight Multimodal Defect Detection at the Edge via Cross-Modal Distillation. / Wang, Baiqing; Xing, Tao; Liu, Xiaoning et al.
2024 IEEE/ACM 32nd International Symposium on Quality of Service, IWQoS 2024. Institute of Electrical and Electronics Engineers Inc., 2024. (IEEE International Workshop on Quality of Service, IWQoS).

TY - GEN

T1 - Lightweight Multimodal Defect Detection at the Edge via Cross-Modal Distillation

AU - Wang, Baiqing

AU - Xing, Tao

AU - Liu, Xiaoning

AU - Peng, Zhe

AU - Cui, Helei

PY - 2024

Y1 - 2024

N2 - The learning capabilities of single-modality images are often severely limited and fail to meet the requirements of complex defect detection in industrial settings. For instance, traditional visible light images are susceptible to environmental factors such as lighting and occlusions, while infrared images cannot capture texture details due to their low spatial resolution. Consequently, employing multiple image modalities typically yields better results than relying on a single modality. However, utilizing data from multiple modalities inevitably introduces additional computational costs, posing high hardware demands on edge computing devices, and the need for real-time detection in industrial environments is critical. To address these challenges, we propose a multimodal distillation approach that uses visible and infrared images as inputs to train a complex teacher model, while the student model continues to operate with a single-modal image input. Through knowledge transfer, the student model is enhanced, and model light-weighting is implemented to ensure that it can acquire multi-modal feature information while still meeting real-time performance requirements.

AB - The learning capabilities of single-modality images are often severely limited and fail to meet the requirements of complex defect detection in industrial settings. For instance, traditional visible light images are susceptible to environmental factors such as lighting and occlusions, while infrared images cannot capture texture details due to their low spatial resolution. Consequently, employing multiple image modalities typically yields better results than relying on a single modality. However, utilizing data from multiple modalities inevitably introduces additional computational costs, posing high hardware demands on edge computing devices, and the need for real-time detection in industrial environments is critical. To address these challenges, we propose a multimodal distillation approach that uses visible and infrared images as inputs to train a complex teacher model, while the student model continues to operate with a single-modal image input. Through knowledge transfer, the student model is enhanced, and model light-weighting is implemented to ensure that it can acquire multi-modal feature information while still meeting real-time performance requirements.

KW - Defect detection

KW - edge intelligence

KW - model lightweight

KW - multi-modal

UR - http://www.scopus.com/inward/record.url?scp=85206385354&partnerID=8YFLogxK

U2 - 10.1109/IWQoS61813.2024.10682904

DO - 10.1109/IWQoS61813.2024.10682904

M3 - Conference contribution

AN - SCOPUS:85206385354

T3 - IEEE International Workshop on Quality of Service, IWQoS

BT - 2024 IEEE/ACM 32nd International Symposium on Quality of Service, IWQoS 2024

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 32nd IEEE/ACM International Symposium on Quality of Service, IWQoS 2024

Y2 - 19 June 2024 through 21 June 2024

ER -
