Domain Adversarial Training for Improving Keyword Spotting Performance of ESL Speech

Jingyong Hou, Pengcheng Guo, Sining Sun, Frank K. Soong, Wenping Hu, Lei Xie

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

10 Citations (Scopus)

Abstract

A second language (L2) learner usually cannot speak the L2 well, in either pronunciation or word formation. Hence his or her L2 speech is not recognized well by a recognizer trained on native data. Domain adversarial training (DAT), which can reduce the acoustic mismatch between training and testing, can be useful for improving speech recognition for L2 learners. To get around ungrammatical L2 speech in scenario-based conversation training, keyword spotting (KWS) is an effective solution because it relaxes the language model constraint in decoding. On the acoustic pronunciation side, this study investigates DAT for training a neural network-based acoustic model. The DAT model is trained with both native speech and the speech of English as a second language (ESL) learners, so that it extracts features that are more invariant between native and ESL speech by equalizing their intrinsic differences. The model is jointly optimized for improved senone classification in training. Tested on both ESL learners' speech and native English, the DAT model achieves recognition performance comparable to that of a jointly trained multi-condition model while significantly improving the recognition of native speech. In KWS, DAT performs consistently better than multi-condition training. The improved performance of the proposed model is also obtained without increasing its computational complexity or model size.
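The abstract does not spell out how the adversarial objective is implemented. One common way to realize DAT is a gradient reversal layer placed between the shared feature extractor and a domain classifier (here, native vs. ESL). The sketch below is a minimal pure-Python illustration of that idea under this assumption; the function names and the weight `lam` are hypothetical and are not taken from the paper.

```python
from typing import List


def grl_forward(features: List[float]) -> List[float]:
    """Gradient reversal layer, forward pass: features pass through unchanged."""
    return list(features)


def grl_backward(grad_from_domain: List[float], lam: float) -> List[float]:
    """Gradient reversal layer, backward pass.

    The gradient coming from the domain classifier (native vs. ESL) is
    multiplied by -lam, so the feature extractor is pushed to make the
    two domains harder to tell apart.
    """
    return [-lam * g for g in grad_from_domain]


def feature_gradient(
    grad_senone: List[float], grad_domain: List[float], lam: float
) -> List[float]:
    """Combine both gradients at the shared feature layer (assumed setup).

    grad_senone: gradient of the senone classification loss w.r.t. features.
    grad_domain: gradient of the domain classification loss w.r.t. features.
    lam:         hypothetical trade-off weight for the adversarial term.
    """
    reversed_grad = grl_backward(grad_domain, lam)
    return [gs + gd for gs, gd in zip(grad_senone, reversed_grad)]


if __name__ == "__main__":
    # Toy two-dimensional feature gradient, values chosen for illustration only.
    print(feature_gradient([1.0, 2.0], [0.5, -1.0], lam=0.5))  # [0.75, 2.5]
```

With this arrangement the senone classifier is trained as usual, while the shared features receive the domain gradient with its sign flipped. Because the domain classifier is only needed during training, this is consistent with the abstract's statement that inference cost and model size do not grow.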

Original language: English
Title of host publication: 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 8122-8126
Number of pages: 5
ISBN (electronic): 9781479981311
DOI
Publication status: Published - May 2019
Event: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom
Duration: 12 May 2019 to 17 May 2019

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2019-May
ISSN (print): 1520-6149

Conference

Conference: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Country/Territory: United Kingdom
City: Brighton
Period: 12/05/19 to 17/05/19
