The NNI Vietnamese speech recognition system for MediaEval 2016

Lei Wang, Chongjia Ni, Cheung Chi Leung, Changhuai You, Lei Xie, Haihua Xu, Xiong Xiao, Tin Lay Nwe, Eng Siong Chng, Bin Ma, Haizhou Li

Research output: Contribution to journal, Conference article, peer-reviewed

Abstract

This paper gives an overview of the Vietnamese speech recognition system developed by the joint team for MediaEval 2016. The submitted system consisted of three sub-systems and adopted several deep neural network-based techniques, such as fMLLR-transformed bottleneck features and sequence training. Besides these acoustic modeling techniques, speech data augmentation was also examined as a way to develop a more robust acoustic model. The I2R team collected a number of text resources from the Internet and made them available to other participants in the task. This crawled web text was used to train a 5-gram language model. The submitted system obtained token error rates (TER) of 15.1, 23.0 and 50.5 on the Devel local set, Devel set and Test set, respectively.

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 1739
Publication status: Published - 2016
Event: 2016 Multimedia Benchmark Workshop, MediaEval 2016 - Hilversum, Netherlands
Duration: 20 Oct 2016 to 21 Oct 2016
