TY - JOUR
T1 - A Dual Pipeline with Spatio-Temporal Attention Fusion Approach for Human Activity Recognition
AU - Wang, Xiaodong
AU - Li, Ying
AU - Fang, Aiqing
AU - He, Pei
AU - Guo, Yangming
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Sensor-based human activity recognition (SHAR) has gained increasing attention due to the rapid development of the Internet of Things (IoT). The critical issue for SHAR is overcoming the performance bottleneck caused by expensive feature engineering. Recent works have explored hybrid neural networks to improve the SHAR model architecture for learning informative representations. However, existing studies have not adequately provided a hierarchical structure that can represent human activities and capture the specific representations hidden beneath interrelated low-level human activity sequences. In this work, we introduce a dual pipeline with a spatio-temporal attention fusion approach, termed the ST-attention dual pipeline, to address this problem. Specifically, the ST-attention dual pipeline employs sequence learning techniques in one pipeline to capture complex dependencies within behavior data and residual learning techniques in the other pipeline to extract hierarchical details, then fuses them with the ST-attention fusion mechanism, generated across spatial and temporal dimensions, to improve representation capabilities. Extensive experiments on public datasets (i.e., OPPORTUNITY, PAMAP2, and USC-HAD) show that the ST-attention dual pipeline yields compelling results and that the spatio-temporal attention mechanism outperforms other fusion methods.
KW - Attention mechanism
KW - depthwise separable convolution (DSC)
KW - human activity recognition (HAR)
KW - hybrid neural network
KW - wearable sensors
UR - http://www.scopus.com/inward/record.url?scp=85197631788&partnerID=8YFLogxK
U2 - 10.1109/JSEN.2024.3416295
DO - 10.1109/JSEN.2024.3416295
M3 - Article
AN - SCOPUS:85197631788
SN - 1530-437X
VL - 24
SP - 25150
EP - 25162
JO - IEEE Sensors Journal
JF - IEEE Sensors Journal
IS - 15
ER -