TY - GEN
T1 - Mapping EEG Signals to Visual Stimuli
T2 - 14th IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2024
AU - Yang, Yiqian
AU - Zhao, Zhengqiao
AU - Wang, Qian
AU - Yang, Yan
AU - Chen, Jingdong
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Existing approaches to modeling associations between visual stimuli and brain responses are facing difficulties in handling between-subject variance and model generalization. Inspired by the recent progress in modeling speech-brain response, we propose a 'match-vs-mismatch' deep learning model in this study to classify whether a video clip elicits neural responses in recorded EEG signals. Our model employs dilated convolutional neural networks and gated recurrent units to extract features from both EEG and video signals, enabling the learning of associations between visual content and corresponding neural recordings. We demonstrate that our proposed model achieves the highest accuracy on unseen subjects compared to other baseline models. Additionally, we assess inter-subject noise using a subject-level silhouette score in the embedding space, revealing that our model effectively mitigates inter-subject noise and significantly reduces the silhouette score. Furthermore, we investigate Grad-CAM activation scores, revealing that brain regions linked to language processing contribute most to model predictions, followed by regions associated with visual processing. These findings hold promise for advancing neural recording-based video reconstruction and related applications.
KW - EEG
KW - deep learning
KW - neural representation
KW - visual content reconstruction
UR - https://www.scopus.com/pages/publications/85214890803
U2 - 10.1109/ICSPCC62635.2024.10770428
DO - 10.1109/ICSPCC62635.2024.10770428
M3 - Conference contribution
AN - SCOPUS:85214890803
T3 - 2024 IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2024
BT - 2024 IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 19 August 2024 through 22 August 2024
ER -