Improved speaker-dependent separation for CHiME-5 challenge

Jian Wu, Yong Xu, Shi Xiong Zhang, Lian Wu Chen, Meng Yu, Lei Xie, Dong Yu

Research output: Contribution to journal › Conference article › peer-review

Abstract

This paper summarizes several contributions toward improving the speaker-dependent separation system for the CHiME-5 challenge, which addresses multi-channel, highly overlapped conversational speech recognition in a dinner-party scenario with reverberation and non-stationary noise. Specifically, we adopt a speaker-aware training method that uses an i-vector as the target-speaker information for multi-talker speech separation. With only one unified separation model for all speakers, we achieve a 10% absolute improvement in word error rate (WER) over the previous baseline of 80.28% on the development set by leveraging our newly proposed data processing techniques and beamforming approach. With our improved back-end acoustic model, we further reduce the WER to 60.15%, which surpasses the result of our submitted CHiME-5 challenge system without applying any fusion techniques.
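To make the speaker-aware training idea concrete, the sketch below shows one common way to condition a mask-estimation network on a target-speaker i-vector: the i-vector is tiled across time and concatenated with the mixture's spectral features before a BLSTM predicts a mask for that speaker. This is a minimal illustration assuming a PyTorch BLSTM mask estimator; the layer sizes, i-vector dimension, and network layout are assumptions for demonstration, not the authors' exact configuration.

```python
# Minimal sketch of i-vector-conditioned (speaker-aware) mask estimation.
# All dimensions and the architecture are illustrative assumptions.
import torch
import torch.nn as nn

class SpeakerAwareSeparator(nn.Module):
    def __init__(self, num_bins=257, ivector_dim=100, hidden=512):
        super().__init__()
        # BLSTM over frames of [spectral features ; i-vector]
        self.blstm = nn.LSTM(
            input_size=num_bins + ivector_dim,
            hidden_size=hidden,
            num_layers=2,
            batch_first=True,
            bidirectional=True,
        )
        # Per-frequency sigmoid mask for the target speaker
        self.mask = nn.Sequential(nn.Linear(2 * hidden, num_bins), nn.Sigmoid())

    def forward(self, mix_mag, ivector):
        # mix_mag: (batch, frames, num_bins) mixture magnitude spectrogram
        # ivector: (batch, ivector_dim) target-speaker embedding
        frames = mix_mag.size(1)
        spk = ivector.unsqueeze(1).expand(-1, frames, -1)  # tile over time
        hidden, _ = self.blstm(torch.cat([mix_mag, spk], dim=-1))
        return self.mask(hidden) * mix_mag  # masked target-speaker estimate

# Usage: extract one target speaker from a multi-talker mixture.
model = SpeakerAwareSeparator()
mix = torch.rand(1, 400, 257)   # |STFT| of the mixture (400 frames)
ivec = torch.randn(1, 100)      # i-vector of the desired speaker
target_mag = model(mix, ivec)   # (1, 400, 257) estimated magnitude
```

One attraction of this design, consistent with the abstract, is that a single unified model serves all speakers: switching the target speaker only requires switching the i-vector fed to the network.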

Keywords

  • Beamforming
  • CHiME-5 challenge
  • Robust speech recognition
  • Speaker-dependent speech separation
  • Speech enhancement
