TY - JOUR
T1 - Improved speaker-dependent separation for CHiME-5 challenge
AU - Wu, Jian
AU - Xu, Yong
AU - Zhang, Shi Xiong
AU - Chen, Lian Wu
AU - Yu, Meng
AU - Xie, Lei
AU - Yu, Dong
N1 - Publisher Copyright: Copyright © 2019 ISCA
PY - 2019
Y1 - 2019
N2 - This paper summarizes several contributions for improving the speaker-dependent separation system for the CHiME-5 challenge, which aims to solve the problem of multi-channel, highly overlapped conversational speech recognition in a dinner party scenario with reverberation and non-stationary noise. Specifically, we adopt a speaker-aware training method that uses i-vectors as the target speaker information for multi-talker speech separation. With only one unified separation model for all speakers, we achieve a 10% absolute improvement in word error rate (WER) over the previous baseline of 80.28% on the development set by leveraging our newly proposed data processing techniques and beamforming approach. With our improved back-end acoustic model, we further reduce the WER to 60.15%, which surpasses the result of our submitted CHiME-5 challenge system without applying any fusion techniques.
AB - This paper summarizes several contributions for improving the speaker-dependent separation system for the CHiME-5 challenge, which aims to solve the problem of multi-channel, highly overlapped conversational speech recognition in a dinner party scenario with reverberation and non-stationary noise. Specifically, we adopt a speaker-aware training method that uses i-vectors as the target speaker information for multi-talker speech separation. With only one unified separation model for all speakers, we achieve a 10% absolute improvement in word error rate (WER) over the previous baseline of 80.28% on the development set by leveraging our newly proposed data processing techniques and beamforming approach. With our improved back-end acoustic model, we further reduce the WER to 60.15%, which surpasses the result of our submitted CHiME-5 challenge system without applying any fusion techniques.
KW - Beamforming
KW - CHiME-5 challenge
KW - Robust speech recognition
KW - Speaker-dependent speech separation
KW - Speech enhancement
UR - http://www.scopus.com/inward/record.url?scp=85074709833&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2019-1569
DO - 10.21437/Interspeech.2019-1569
M3 - Conference article
AN - SCOPUS:85074709833
SN - 2308-457X
VL - 2019-September
SP - 466
EP - 470
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019
Y2 - 15 September 2019 through 19 September 2019
ER -