DESNet: A Multi-Channel Network for Simultaneous Speech Dereverberation, Enhancement and Separation

Yihui Fu, Jian Wu, Yanxin Hu, Mengtao Xing, Lei Xie

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Peer-reviewed

24 Citations (Scopus)

Abstract

In this paper, we propose a multi-channel network for simultaneous speech dereverberation, enhancement and separation (DESNet). To enable gradient propagation and joint optimization, we adopt the attentional selection mechanism for multi-channel features, which was originally proposed in the end-to-end unmixing, fixed-beamforming and extraction (E2E-UFE) structure. Furthermore, the novel deep complex convolutional recurrent network (DCCRN) is used as the structure for speech unmixing, and a neural-network-based weighted prediction error (WPE) module is cascaded beforehand for speech dereverberation. We also introduce the staged SNR strategy and symphonic loss for training the network to further improve the final performance. Experiments show that in the non-dereverberated case, the proposed DESNet outperforms DCCRN and most state-of-the-art structures in speech enhancement and separation, while in the dereverberated scenario, DESNet also shows improvements over the cascaded WPE-DCCRN networks.

Original language: English
Title of host publication: 2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 857-864
Number of pages: 8
ISBN (Electronic): 9781728170664
DOI
Publication status: Published - 19 Jan 2021
Event: 2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Virtual, Shenzhen, China
Duration: 19 Jan 2021 to 22 Jan 2021

Publication series

Name: 2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings

Conference

Conference: 2021 IEEE Spoken Language Technology Workshop, SLT 2021
Country/Territory: China
City: Virtual, Shenzhen
Period: 19/01/21 to 22/01/21
