
Decoding multilingual imagined speech from scalp EEG via dynamic differentiable graph hierarchical fusion network

  • Zhongjie Li
  • Shengrui He
  • Jinfeng Huang
  • Gaoyan Zhang
  • Jianwu Dang
  • Shijie Zhao
  • Junwei Han

Affiliations:

  • Tianjin University
  • Northwestern Polytechnical University, Xi'an
  • Shenzhen University
  • Shenzhen Institute of Advanced Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Decoding imagined speech with electroencephalography (EEG)-based brain–computer interface (BCI) techniques has great potential for restoring communication in individuals with severe speech impairments. Despite recent progress, decoding performance with scalp EEG remains limited due to the inherently low signal-to-noise ratio and the subtlety of the neural representations. Most existing methods overlook the hierarchical fusion of multiscale temporal dependence under an adaptive scalp-level functional topology. Moreover, decoding multilingual imagined speech presents additional challenges due to cross-linguistic differences in phonological structure and neural activation patterns, necessitating models with superior generalization capabilities. To address these issues, we propose a Dynamic Differentiable Graph Hierarchical Fusion Network (DDGHFNet) for multilingual imagined-speech decoding from scalp EEG. The model performs feature-level fusion via multi-scale temporal convolutions with kernel-level attentive mixing, and structure-level fusion via hierarchical local–global graph filtering with subject-specific dynamic adjacencies, forming a unified end-to-end spatiotemporal pipeline. We built a multilingual 8-class dataset and evaluated DDGHFNet in a subject-independent manner on both our dataset and the public Spanish inner-speech corpus "Thinking out loud". Experimental results demonstrate that DDGHFNet surpasses strong baselines by 1.5–18.9 percentage points on our dataset and establishes a new state-of-the-art accuracy of 33.2% on the Spanish corpus without subject-specific fine-tuning. Ablation experiments and visualization analyses verify the effectiveness and explainability of the feature learning and fusion strategies, indicating that principled fusion across time and graph structure enables robust cross-linguistic generalization for practical, language-flexible BCIs.
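To make the feature-level fusion step concrete, the following is a minimal illustrative sketch (not the authors' implementation) of multi-scale temporal convolutions with kernel-level attentive mixing: each branch filters the EEG channels at a different temporal scale, and a softmax weight per kernel fuses the branches. All function names, kernel sizes, and the energy-based attention score are assumptions introduced here for illustration.

```python
import numpy as np

def multiscale_temporal_fusion(eeg, kernel_sizes=(3, 7, 15), seed=0):
    """Hypothetical sketch of kernel-level attentive mixing.

    eeg: array of shape (channels, time).
    Each kernel size defines one temporal branch; a softmax over
    per-branch scores fuses the branches into one (channels, time) map.
    """
    rng = np.random.default_rng(seed)
    C, T = eeg.shape
    branches = []
    for k in kernel_sizes:
        # Stand-in for a learned temporal filter at this scale.
        kern = rng.standard_normal(k) / np.sqrt(k)
        out = np.stack([np.convolve(eeg[c], kern, mode="same")
                        for c in range(C)])          # (C, T)
        branches.append(out)
    stacked = np.stack(branches)                      # (K, C, T)
    # Illustrative attention score: mean activation of each branch.
    scores = stacked.reshape(len(kernel_sizes), -1).mean(axis=1)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                          # softmax over kernels
    return np.tensordot(weights, stacked, axes=1)     # fused (C, T) feature
```

In a trained network the filters and attention scores would be learned end-to-end; here random filters and an energy heuristic stand in only to show the data flow.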

Original language: English
Article number: 104262
Journal: Information Fusion
Volume: 133
DOIs
State: Published - Sep 2026

Keywords

  • Brain-computer interface (BCI)
  • Inner speech decoding
  • Multilingual imagery
  • Scalp electroencephalography (EEG)
  • Spatiotemporal graph features
