Abstract
Depression is a common mental illness that can affect public safety. Traditional clinical assessment of depression relies mainly on interviews and rating scales, which are highly subjective and inefficient. Existing automatic approaches, however, struggle with heterogeneous modality alignment, fine-grained cross-modal interaction, and global feature modeling. As a result, they cannot fully exploit the depression-related information in multimodal data, which limits recognition accuracy. This study therefore proposes a graph clustering fusion network for depression recognition. First, one-dimensional convolution and linear mapping unify the video and audio features into the same token space to obtain aligned feature representations. Second, a shared Transformer is constructed and a bidirectional feature fusion attention mechanism is designed to model the conditional dependencies between audio and video at the token level. Finally, the fused tokens are explicitly modeled as a graph, and graph convolution and soft clustering pooling are introduced to extract a small number of task-related semantic prototypes, which form a robust global representation. Extensive experiments on public datasets demonstrate that the proposed method significantly outperforms competing methods.
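The three stages named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the feature sizes, the random weights, the choice of which modality gets the convolution, the cosine k-nearest-neighbour graph, and the DiffPool-style assignment are all assumptions made for illustration, and the shared Transformer is reduced here to a single residual cross-attention step in each direction.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def conv1d_tokens(x, kernel):
    """'Valid' temporal 1-D convolution. x: (T, C_in), kernel: (K, C_in, D) -> (T-K+1, D)."""
    k = kernel.shape[0]
    windows = np.stack([x[t:t + k] for t in range(x.shape[0] - k + 1)])
    return np.einsum("tkc,kcd->td", windows, kernel)

def cross_attention(query_tokens, context_tokens):
    """Single-head scaled dot-product attention from one modality to the other."""
    d = query_tokens.shape[-1]
    weights = softmax(query_tokens @ context_tokens.T / np.sqrt(d))
    return query_tokens + weights @ context_tokens  # residual connection

def similarity_graph(tokens, k=4):
    """Symmetric k-NN adjacency from cosine similarity, with self-loops (assumed graph rule)."""
    unit = tokens / (np.linalg.norm(tokens, axis=1, keepdims=True) + 1e-8)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # exclude self from the neighbour search
    nearest = np.argsort(-sim, axis=1)[:, :k]
    n = tokens.shape[0]
    adj = np.zeros((n, n))
    adj[np.arange(n)[:, None], nearest] = 1.0
    adj = np.maximum(adj, adj.T) + np.eye(n)
    inv_sqrt_deg = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]  # D^-1/2 A D^-1/2

def fuse(video, audio, dim=16, num_prototypes=4):
    # Stage 1: map both modalities into one token space (all weights are random placeholders).
    w_video = rng.normal(scale=0.1, size=(3, video.shape[1], dim))
    w_audio = rng.normal(scale=0.1, size=(audio.shape[1], dim))
    video_tokens = conv1d_tokens(video, w_video)
    audio_tokens = audio @ w_audio

    # Stage 2: bidirectional token-level fusion (video attends to audio and vice versa).
    video_given_audio = cross_attention(video_tokens, audio_tokens)
    audio_given_video = cross_attention(audio_tokens, video_tokens)
    fused = np.concatenate([video_given_audio, audio_given_video], axis=0)

    # Stage 3: graph convolution, then soft clustering pooling into a few prototypes.
    adj_norm = similarity_graph(fused)
    w_gcn = rng.normal(scale=0.1, size=(dim, dim))
    w_assign = rng.normal(scale=0.1, size=(dim, num_prototypes))
    hidden = np.maximum(adj_norm @ fused @ w_gcn, 0.0)   # one GCN layer with ReLU
    assign = softmax(adj_norm @ hidden @ w_assign)        # (tokens, prototypes), rows sum to 1
    prototypes = assign.T @ hidden                        # (prototypes, dim)
    return prototypes, assign

# Hypothetical inputs: 30 video frames x 136 features, 50 audio frames x 25 features.
video = rng.normal(size=(30, 136))
audio = rng.normal(size=(50, 25))
prototypes, assign = fuse(video, audio)
global_representation = prototypes.reshape(-1)  # would feed a classifier or regressor head
```

In this sketch the 78 fused tokens (28 video, 50 audio) are compressed into four prototype vectors, so the downstream head sees a fixed-size input regardless of clip length. In a trained model the weights and the number of prototypes would be learned or tuned rather than fixed as above.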
| Original language | English |
|---|---|
| Title of host publication | International Conference on Machine Learning and Artificial Intelligence Applications, MLAIA 2025 |
| Editors | Jianhua Zhou |
| Publisher | SPIE |
| ISBN (Electronic) | 9798902322276 |
| DOIs | |
| State | Published - 9 Mar 2026 |
| Event | International Conference on Machine Learning and Artificial Intelligence Applications, MLAIA 2025 - Shaoyang, China; Duration: 12 Dec 2025 → 14 Dec 2025 |
Publication series
| Name | Proceedings of SPIE - The International Society for Optical Engineering |
|---|---|
| Volume | 14134 |
| ISSN (Print) | 0277-786X |
| ISSN (Electronic) | 1996-756X |
Conference
| Conference | International Conference on Machine Learning and Artificial Intelligence Applications, MLAIA 2025 |
|---|---|
| Country/Territory | China |
| City | Shaoyang |
| Period | 12/12/25 → 14/12/25 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Deep Learning
- Depression Recognition
- Information Fusion
Fingerprint
Dive into the research topics of 'A Graph Clustering Fusion Network for Depression Recognition'. Together they form a unique fingerprint.