DC-TseNet: A dual-channel time-domain speech enhancement network

Yihui Fu, Sining Sun, Lei Xie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, we propose an end-to-end dual-channel time-domain speech enhancement approach, named DC-TseNet, for devices with multiple microphones, such as mobile phones used in far-field scenarios like teleconferencing. DC-TseNet incorporates a computationally efficient CNN to form a unified encoder-enhancement-decoder structure that learns clean speech directly from multichannel signals. In addition, DC-TseNet is trained on both intra-channel and inter-channel features to express the relevance and difference between the signals collected by the two microphones, which makes full use of spatial information and reduces the influence of recording direction on the signals. The experimental results show that the proposed dual-channel time-domain approach, with a more compact model size, significantly outperforms the LSTM-based frequency-domain method. Furthermore, we find that the inter-channel information, especially the difference between the two channels, is more important for a better performance gain.
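As a rough illustration of what intra-channel and inter-channel inputs could look like for a two-microphone time-domain system, the following minimal sketch frames two waveforms and derives a per-frame difference and a per-frame normalized correlation. It is not the authors' implementation: the function names (`frame_signal`, `dual_channel_features`), the frame length, the hop size, and the choice of normalized correlation as a stand-in for "relevance" are all assumptions made for illustration, since the abstract does not specify how these features are computed.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D waveform into overlapping frames, shape (n_frames, frame_len)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    # Row i holds the sample indices of frame i.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def dual_channel_features(ch1, ch2, frame_len=40, hop=20, eps=1e-8):
    """Hypothetical feature extraction for two microphone signals.

    frame_len, hop, and the feature definitions are illustrative
    assumptions, not values taken from the paper.
    """
    f1 = frame_signal(ch1, frame_len, hop)
    f2 = frame_signal(ch2, frame_len, hop)

    # Intra-channel: each channel's own frames, stacked on a channel axis.
    intra = np.stack([f1, f2], axis=0)

    # Inter-channel difference: sample-wise difference of aligned frames.
    difference = f1 - f2

    # Inter-channel relevance (assumed here): normalized correlation per frame.
    num = np.sum(f1 * f2, axis=1)
    den = np.linalg.norm(f1, axis=1) * np.linalg.norm(f2, axis=1) + eps
    relevance = num / den

    return {"intra": intra, "difference": difference, "relevance": relevance}
```

With two identical inputs, the difference frames are all zero and the relevance is close to 1 for every frame; in a learned system such features would be fed to the network rather than used directly.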

Original language: English
Title of host publication: 2020 8th International Conference on Orange Technology, ICOT 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665418522
State: Published - 18 Dec 2020
Event: 8th International Conference on Orange Technology, ICOT 2020 - Daegu, Korea, Republic of
Duration: 18 Dec 2020 to 21 Dec 2020

Publication series

Name: 2020 8th International Conference on Orange Technology, ICOT 2020

Conference

Conference: 8th International Conference on Orange Technology, ICOT 2020
Country/Territory: Korea, Republic of
City: Daegu
Period: 18/12/20 to 21/12/20

Keywords

  • CNN
  • DC-TseNet
  • Dual-channel
  • Time-domain speech enhancement
