TY - JOUR
T1 - Adversarial Multi-view Networks for Activity Recognition
AU - Bai, Lei
AU - Yao, Lina
AU - Wang, Xianzhi
AU - Kanhere, Salil S.
AU - Guo, Bin
AU - Yu, Zhiwen
N1 - Publisher Copyright:
© 2020 ACM.
PY - 2020/6/15
Y1 - 2020/6/15
AB - Human activity recognition (HAR) plays an irreplaceable role in various applications and has been a prosperous research topic for years. Recent studies show significant progress in feature extraction (i.e., data representation) using deep learning techniques. However, they face significant challenges in capturing multi-modal spatio-temporal patterns from the sensory data, and they commonly overlook the variations between subjects. We propose a Discriminative Adversarial MUlti-view Network (DAMUN) to address the above issues in sensor-based HAR. We first design a multi-view feature extractor to obtain representations of sensory data streams from temporal, spatial, and spatio-temporal views using convolutional networks. Then, we fuse the multi-view representations into a robust joint representation through a trainable Hadamard fusion module, and finally employ a Siamese adversarial network architecture to reduce the variations between the representations of different subjects. We have conducted extensive experiments under an iterative leave-one-subject-out setting on three real-world datasets and demonstrated both the effectiveness and robustness of our approach.
KW - Activity Recognition
KW - Adversarial Training
KW - Deep Learning
KW - Multi-view Representation
UR - http://www.scopus.com/inward/record.url?scp=85089756468&partnerID=8YFLogxK
U2 - 10.1145/3397323
DO - 10.1145/3397323
M3 - Article
AN - SCOPUS:85089756468
SN - 2474-9567
VL - 4
JO - Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
JF - Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
IS - 2
M1 - 42
ER -