DCCRN: Deep complex convolution recurrent network for phase-aware speech enhancement

Yanxin Hu, Yun Liu, Shubo Lv, Mengtao Xing, Shimin Zhang, Yihui Fu, Jian Wu, Bihong Zhang, Lei Xie

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

421 Citations (Scopus)

Abstract

Speech enhancement has benefited from the success of deep learning in terms of intelligibility and perceptual quality. Conventional time-frequency (TF) domain methods focus on predicting TF-masks or speech spectrum, via a naive convolution neural network (CNN) or recurrent neural network (RNN). Some recent studies use complex-valued spectrogram as a training target but train in a real-valued network, predicting the magnitude and phase component or real and imaginary part, respectively. Particularly, convolution recurrent network (CRN) integrates a convolutional encoder-decoder (CED) structure and long short-term memory (LSTM), which has been proven to be helpful for complex targets. In order to train the complex target more effectively, in this paper, we design a new network structure simulating the complex-valued operation, called Deep Complex Convolution Recurrent Network (DCCRN), where both CNN and RNN structures can handle complex-valued operation. The proposed DCCRN models are very competitive over other previous networks, either on objective or subjective metric. With only 3.7M parameters, our DCCRN models submitted to the Interspeech 2020 Deep Noise Suppression (DNS) challenge ranked first for the real-time-track and second for the non-real-time track in terms of Mean Opinion Score (MOS).
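
As a rough illustration of what simulating a complex-valued operation with real-valued building blocks can mean, the sketch below assembles a complex 1-D convolution from four real convolutions, using the standard identity (W_r + jW_i) * (X_r + jX_i) = (W_r * X_r - W_i * X_i) + j(W_r * X_i + W_i * X_r). The function name `complex_conv1d`, the NumPy implementation, and the 1-D setting are assumptions made for this example only; the abstract does not specify the layer design, and this is not the authors' code.

```python
import numpy as np


def complex_conv1d(x_real, x_imag, w_real, w_imag):
    """Complex 1-D convolution built from four real-valued convolutions.

    Hypothetical helper for illustration; inputs are real 1-D arrays holding
    the real and imaginary parts of the signal (x) and the kernel (w).
    """
    # Real part: W_r * X_r - W_i * X_i
    out_real = np.convolve(x_real, w_real) - np.convolve(x_imag, w_imag)
    # Imaginary part: W_i * X_r + W_r * X_i
    out_imag = np.convolve(x_real, w_imag) + np.convolve(x_imag, w_real)
    return out_real, out_imag


# Random complex signal and kernel (arbitrary sizes chosen for the demo).
rng = np.random.default_rng(0)
x = rng.standard_normal(16) + 1j * rng.standard_normal(16)
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)

out_real, out_imag = complex_conv1d(x.real, x.imag, w.real, w.imag)

# Reference: NumPy's native complex convolution of the same inputs.
reference = np.convolve(x, w)
matches = np.allclose(out_real, reference.real) and np.allclose(out_imag, reference.imag)
```

Here `matches` compares the four-real-convolution result with NumPy's native complex convolution, which is one way to sanity-check that the real and imaginary branches are wired with the correct signs.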

Original language: English
Title of host publication: Interspeech 2020
Publisher: International Speech Communication Association
Pages: 2472-2476
Number of pages: 5
ISBN (Print): 9781713820697
DOI
Publication status: Published - 2020
Event: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 - Shanghai, China
Duration: 25 Oct 2020 to 29 Oct 2020

Publication series

Name: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2020-October
ISSN (Print): 2308-457X
ISSN (Electronic): 1990-9772

Conference

Conference: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020
Country/Territory: China
City: Shanghai
Period: 25/10/20 to 29/10/20
