Component Fusion: Learning Replaceable Language Model Component for End-to-end Speech Recognition System

Changhao Shan, Chao Weng, Guangsen Wang, Dan Su, Min Luo, Dong Yu, Lei Xie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

78 Citations (Scopus)

Abstract

Recently, attention-based end-to-end automatic speech recognition (ASR) systems have shown promising results. One limitation of an attention-based ASR system is that its language model (LM) component has to be learned implicitly from transcribed speech data, which prevents one from utilizing plentiful text corpora to improve language modeling. In this work, the Component Fusion method is proposed to incorporate an externally trained neural network (NN) LM into an attention-based ASR system. During the training stage, we equip the attention-based system with an additional LM component, which is replaced by an externally trained NN LM at the decoding stage. Experimental results show that the proposed Component Fusion outperforms two prior LM fusion approaches, i.e., Shallow Fusion and Cold Fusion, in both out-of-domain and in-domain scenarios. Further improvements can be achieved by combining Component Fusion and Shallow Fusion.

Original language: English
Title of host publication: 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5631-5635
Number of pages: 5
ISBN (Electronic): 9781479981311
DOI
Publication status: Published - May 2019
Event: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom
Duration: 12 May 2019 to 17 May 2019

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2019-May
ISSN (Print): 1520-6149

Conference

Conference: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Country/Territory: United Kingdom
City: Brighton
Period: 12/05/19 to 17/05/19
