Spatial-DCCRN: DCCRN Equipped with Frame-Level Angle Feature and Hybrid Filtering for Multi-Channel Speech Enhancement

Shubo Lv, Yihui Fu, Yukai Jv, Lei Xie, Weixin Zhu, Wei Rao, Yannan Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

11 Citations (Scopus)

Abstract

Recently, multi-channel speech enhancement has drawn much interest due to the use of spatial information to distinguish target speech from interfering signals. To make full use of spatial information and neural-network-based mask estimation, we propose a multi-channel denoising neural network, Spatial-DCCRN. First, we extend S-DCCRN to the multi-channel scenario, performing a cascaded sub-channel and full-channel processing strategy that can model different channels separately. Moreover, instead of only adopting the multi-channel spectrum or concatenating the first channel's magnitude and inter-channel phase difference (IPD) as the model's inputs, we apply an angle feature extraction (AFE) module to extract frame-level angle feature embeddings, which help the model perceive spatial information explicitly. Finally, since residual noise is more severe when noise and speech occupy the same time-frequency (TF) bin, we design a masking and mapping filtering method to replace the traditional filter-and-sum operation, with the aim of cascading coarse denoising, dereverberation, and residual noise suppression. The proposed model, Spatial-DCCRN, surpasses EaBNet, FasNet, and several other competitive models on the L3DAS22 Challenge dataset. Beyond the 3D scenario, Spatial-DCCRN also outperforms the state-of-the-art (SOTA) model MIMO-UNet by a large margin on multiple evaluation metrics on the multi-channel ConferencingSpeech2021 Challenge dataset. Ablation studies also demonstrate the effectiveness of the individual contributions.

Original language: English
Title of host publication: 2022 IEEE Spoken Language Technology Workshop, SLT 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 436-443
Number of pages: 8
ISBN (electronic): 9798350396904
Publication status: Published - 2023
Event: 2022 IEEE Spoken Language Technology Workshop, SLT 2022 - Doha, Qatar
Duration: 9 Jan 2023 to 12 Jan 2023

Publication series

Name: 2022 IEEE Spoken Language Technology Workshop, SLT 2022 - Proceedings

Conference

Conference: 2022 IEEE Spoken Language Technology Workshop, SLT 2022
Country/Territory: Qatar
City: Doha
Period: 9 Jan 2023 to 12 Jan 2023
