Study of semi-supervised approaches to improving English-Mandarin code-switching speech recognition

Pengcheng Guo, Haihua Xu, Lei Xie, Eng Siong Chng

Research output: Contribution to journal › Conference article › peer-review


Abstract

In this paper, we present our efforts to improve the performance of a code-switching speech recognition system using semi-supervised training methods, from lexicon learning to acoustic modeling, on the South East Asian Mandarin-English (SEAME) data. We first investigate a semi-supervised lexicon learning approach to adapt the canonical lexicon, which is meant to alleviate the heavily accented pronunciations that arise in local code-switching conversations. As a result, the learned lexicon yields improved performance. Furthermore, we attempt to use semi-supervised training to deal with transcriptions that are highly mismatched between the human transcribers and the ASR system. Specifically, we conduct semi-supervised training by treating the poorly transcribed data as unsupervised data. We find that semi-supervised acoustic modeling leads to improved results. Finally, to compensate for the data sparsity limitation of conventional n-gram language models, we perform lattice rescoring using neural network language models, and a significant WER reduction is obtained.
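To illustrate the data-partitioning step behind the semi-supervised acoustic modeling described above, the following is a minimal sketch: utterances whose human transcript disagrees strongly with the seed ASR hypothesis are relabeled as unsupervised data. The `Utterance` record, the field names, and the 40% WER threshold are illustrative assumptions, not details taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical utterance record; field names are assumptions for illustration.
@dataclass
class Utterance:
    utt_id: str
    human_transcript: str   # transcript produced by the human transcriber
    asr_hypothesis: str     # 1-best hypothesis from the seed ASR system


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Levenshtein word error rate between a reference and a hypothesis."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words.
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dist[i][j] = min(sub, dist[i - 1][j] + 1, dist[i][j - 1] + 1)
    return dist[len(ref)][len(hyp)] / max(len(ref), 1)


def split_for_semi_supervised(utts, wer_threshold=0.4):
    """Keep well-matched utterances as supervised data; relabel the rest as
    unsupervised so they can later be decoded and trained on ASR-derived targets.
    The threshold value is an assumed example, not the paper's setting."""
    supervised, unsupervised = [], []
    for u in utts:
        if word_error_rate(u.human_transcript, u.asr_hypothesis) <= wer_threshold:
            supervised.append(u)
        else:
            unsupervised.append(u)   # human transcript considered unreliable
    return supervised, unsupervised
```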

Original language: English
Pages (from-to): 1928-1932
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2018-September
DOIs
State: Published - 2018
Event: 19th Annual Conference of the International Speech Communication Association, INTERSPEECH 2018 - Hyderabad, India
Duration: 2 Sep 2018 – 6 Sep 2018

Keywords

  • Code-switching
  • Lattice rescoring
  • Lexicon learning
  • Semi-supervised training
  • Speech recognition
