Abstract
This paper proposes a novel multimodal fusion network (MRDFNet) for egocentric hand action recognition from RGB-D videos. First, three separate streams extract spatio-temporal features from the individual modalities: RGB frames, optical flow stacks, and depth frames. For the RGB and depth streams in particular, an Attention-based Bidirectional Long Short-Term Memory network (Bi-LSTA) identifies regions of interest both spatially and temporally. The extracted features are then fed into a fusion module to obtain an integrated feature, which is finally used for egocentric hand action recognition. The fusion module learns complementary information from multiple modalities, i.e., it preserves the distinctive properties of each modality while exploring the properties shared across modalities. Experimental results on both the self-collected RGB-D Egocentric Manual Operation Dataset in Electrical Substations (REMOD-ES) and the THU-READ dataset, which contains daily-life actions, show the superiority of the proposed approach over state-of-the-art methods.
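The abstract does not specify how the fusion module is built, so the following is only a toy sketch of the general idea of keeping a shared component and per-modality distinctive components. The function name `fuse_features`, the use of an element-wise mean as the "shareable" part, and the residuals as the "distinctive" part are all assumptions made for illustration, not the MRDFNet design.

```python
from typing import Dict, List


def fuse_features(features: Dict[str, List[float]]) -> List[float]:
    """Toy fusion (hypothetical, not the paper's module).

    Concatenates each modality's residual from the cross-modal mean
    (a stand-in for "distinctive" information) with the mean itself
    (a stand-in for "shareable" information).
    """
    if not features:
        raise ValueError("at least one modality is required")
    dims = {len(v) for v in features.values()}
    if len(dims) != 1:
        raise ValueError("all modality features must share one dimension")
    dim = dims.pop()
    n = len(features)

    # Shareable part (assumed): element-wise mean across modalities.
    shared = [sum(v[i] for v in features.values()) / n for i in range(dim)]

    # Distinctive part (assumed): what each modality carries beyond the mean.
    fused: List[float] = []
    for name in sorted(features):  # fixed order keeps the layout deterministic
        fused.extend(x - s for x, s in zip(features[name], shared))

    fused.extend(shared)
    return fused


# Stand-in per-stream feature vectors; real ones would come from the
# RGB, optical flow, and depth streams.
example = {"rgb": [1.0, 2.0], "flow": [3.0, 4.0], "depth": [5.0, 6.0]}
integrated = fuse_features(example)
```

With three modalities of dimension `d`, the integrated vector has length `4 * d`: three residual blocks followed by the shared block. A learned fusion module would replace the fixed mean with trainable projections.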
| Original language | English |
|---|---|
| Title of host publication | 2023 IEEE International Conference on Energy Technologies for Future Grids, ETFG 2023 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| ISBN (Electronic) | 9781665471640 |
| DOI | |
| Publication status | Published - 2023 |
| Event | 2023 IEEE International Conference on Energy Technologies for Future Grids, ETFG 2023 - Wollongong, Australia. Duration: 3 Dec 2023 → 6 Dec 2023 |
Publication series
| Name | 2023 IEEE International Conference on Energy Technologies for Future Grids, ETFG 2023 |
|---|---|
Conference
| Conference | 2023 IEEE International Conference on Energy Technologies for Future Grids, ETFG 2023 |
|---|---|
| Country/Territory | Australia |
| City | Wollongong |
| Period | 3 Dec 2023 → 6 Dec 2023 |
UN Sustainable Development Goals
This output contributes to the following Sustainable Development Goals:
- SDG 7: Affordable and Clean Energy