Hand Action Recognition from RGB-D Egocentric Videos in Substations Operations and Maintenance

Yiyang Yao, Xue Wang, Guoqing Zhou, Qing Wang

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

This paper proposes a novel multimodal fusion network (MRDFNet) for egocentric hand action recognition from RGB-D videos. First, three separate streams extract spatio-temporal features from three modalities: RGB frames, optical flow stacks, and depth frames. For the RGB and depth streams, an Attention-based Bidirectional Long Short-Term Memory network (Bi-LSTA) identifies regions of interest both spatially and temporally. The extracted features are then fed into a fusion module to obtain an integrated feature, which is used for egocentric hand action recognition. The fusion module learns complementary information from the modalities, i.e., it preserves the distinctive properties of each modality while exploring the properties shared across modalities. Experimental results on both the self-collected RGB-D Egocentric Manual Operation Dataset in Electrical Substations (REMOD-ES) and the THU-READ dataset of daily-life actions show the superiority of the proposed approach over state-of-the-art methods.
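
The three-stream-then-fuse pipeline described in the abstract can be illustrated with a toy sketch. This is not MRDFNet: the abstract gives no layer definitions, so every name, dimension, and random weight below is a hypothetical stand-in. A simple softmax attention pooling over frames takes the place of Bi-LSTA for the RGB and depth streams, the optical flow stream is mean-pooled, and the fusion step shows just one possible reading of "distinctive" versus "shareable" properties: per-modality projections are concatenated with the average of a projection shared by all modalities.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Weight each frame feature by softmax(frame . query), then sum over time.

    Hypothetical stand-in for temporal attention; it is not Bi-LSTA.
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames))
              for d in range(dim)]
    return pooled, weights


def mean_pool(frames):
    """Average frame features over time (used here for the flow stream)."""
    return [sum(col) / len(frames) for col in zip(*frames)]


def linear(vec, matrix):
    """Multiply a feature vector by an (out_dim x in_dim) weight matrix."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def fuse(features, specific_w, shared_w):
    """Concatenate modality-specific projections with the mean shared projection.

    Assumed fusion rule for illustration only.
    """
    specific = [linear(f, specific_w[name]) for name, f in features.items()]
    shared = [linear(f, shared_w) for f in features.values()]
    shared_mean = [sum(col) / len(shared) for col in zip(*shared)]
    return [x for part in specific for x in part] + shared_mean


def rand_matrix(rows, cols):
    """Random weights or features; real values would come from trained networks."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
T, DIM, SPECIFIC_DIM, SHARED_DIM = 4, 6, 3, 2  # assumed toy sizes
modalities = ["rgb", "flow", "depth"]

stream_features = {}
for name in modalities:
    frames = rand_matrix(T, DIM)  # placeholder per-frame features
    if name == "flow":
        stream_features[name] = mean_pool(frames)
    else:
        query = [random.uniform(-1, 1) for _ in range(DIM)]
        stream_features[name], _ = attention_pool(frames, query)

specific_w = {name: rand_matrix(SPECIFIC_DIM, DIM) for name in modalities}
shared_w = rand_matrix(SHARED_DIM, DIM)
fused = fuse(stream_features, specific_w, shared_w)
print(len(fused))  # 3 * SPECIFIC_DIM + SHARED_DIM = 11
```

In a real system the fused vector would feed a classifier over action labels, and all weights would be learned end to end rather than sampled at random.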

Original language: English
Title of host publication: 2023 IEEE International Conference on Energy Technologies for Future Grids, ETFG 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665471640
DOI
Publication status: Published - 2023
Event: 2023 IEEE International Conference on Energy Technologies for Future Grids, ETFG 2023 - Wollongong, Australia
Duration: 3 Dec 2023 to 6 Dec 2023

Publication series

Name: 2023 IEEE International Conference on Energy Technologies for Future Grids, ETFG 2023

Conference

Conference: 2023 IEEE International Conference on Energy Technologies for Future Grids, ETFG 2023
Country/Territory: Australia
City: Wollongong
Period: 3/12/23 to 6/12/23
