Multi-modal Spatio-temporal Transformer for Defect Recognition of Substation Equipment

Yiyang Yao, Zexing Du, Xue Wang, Qing Wang

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

Abstract

Work on multi-spectral imaging (infrared, visible light, and ultraviolet) for recognizing defects in electrical equipment has mostly focused on static measurements, with little exploration of the dynamic process of defect development. To better exploit dynamic measurements, this paper proposes a novel defect recognition method using tri-spectral videos. Specifically, a multi-modal spatio-temporal Transformer is presented to effectively decompose the spatio-temporal features present in the various modalities. In addition, a spatio-temporal multi-modal contrastive loss is introduced for self-supervised learning. By aligning extracted features both spatially and temporally across modalities, this loss helps mitigate confusion between modalities and improves the discriminative capacity of the learned representations. To evaluate the proposed method, we collect a tri-spectral dataset, TROPED, which covers a wide range of dynamic defects in operational substation equipment, and report benchmark results on it. Experimental results demonstrate the effectiveness and robustness of the proposed method compared with other state-of-the-art methods.
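
The abstract does not state the form of the contrastive loss, so the following is only a generic illustration of cross-modal contrastive alignment, not the paper's method. It sketches a symmetric InfoNCE loss in plain Python, assuming that each modality yields one feature vector per clip and that vectors at the same index come from the same clip. The function names (`info_nce`, `symmetric_info_nce`), the temperature of 0.1, and the toy feature values are all assumptions made for this example.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def info_nce(anchors, positives, temperature=0.1):
    """One-directional InfoNCE: anchors[i] should match positives[i].

    Every other entry of `positives` acts as a negative for anchors[i].
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity to every candidate, scaled by the temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def symmetric_info_nce(feats_a, feats_b, temperature=0.1):
    """Average the loss over both matching directions (A to B and B to A)."""
    return 0.5 * (info_nce(feats_a, feats_b, temperature)
                  + info_nce(feats_b, feats_a, temperature))


# Toy (hypothetical) per-clip features for two modalities.
infrared = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
visible = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]

aligned = symmetric_info_nce(infrared, visible)
# Rotating one modality by one clip breaks the correspondence.
shuffled = symmetric_info_nce(infrared, visible[1:] + visible[:1])
```

In this toy setting the loss for correctly paired clips is lower than for the misaligned pairing, which is the behavior a cross-modal alignment objective is meant to reward. A spatio-temporal variant would presumably index features by spatial location and time step as well, but that detail is not given in the abstract.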

Original language: English
Title of host publication: Artificial Intelligence and Robotics - 9th International Symposium, ISAIR 2024, Revised Selected Papers
Editor: Huimin Lu
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 211-222
Number of pages: 12
ISBN (print): 9789819629138
Publication status: Published - 2025
Event: 9th International Symposium on Artificial Intelligence and Robotics, ISAIR 2024 - Guilin, China
Duration: 27 Sep 2024 to 30 Sep 2024

Publication series

Name: Communications in Computer and Information Science
Volume: 2403 CCIS
ISSN (print): 1865-0929
ISSN (electronic): 1865-0937

Conference

Conference: 9th International Symposium on Artificial Intelligence and Robotics, ISAIR 2024
Country/Territory: China
City: Guilin
Period: 27/09/24 to 30/09/24
