A novel SM-DBN model for large-vocabulary continuous speech recognition and phone segmentation

Guoyun Lu, Dongmei Jiang, Yanning Zhang, Rongchun Zhao, Hichem Sahli

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

A novel SM-DBN (Single-stream Multi-state Dynamic Bayesian Network) model is proposed. It extends the Single Stream DBN Phone-shared (SS-DBN-P) model proposed by Bilmes et al. [4], whose basic recognition units are words, by adding an extra level of hidden nodes (states). In our model, a word is composed of its corresponding phones, a phone is composed of a fixed number of states, and each state is associated with the observation features. Essentially, it is a phone model whose basic recognition units are phones. We perform recognition and segmentation experiments on both a continuous digital speech database and a large-vocabulary speech database; the experimental results are given in Tables 1 through 3 of the full paper. Preliminary results in the large-vocabulary, clean-speech environment show that the speech recognition rate of the SM-DBN model is 13.01% and 35% higher than those of the HMM (Hidden Markov Model) and the SS-DBN-P model, respectively, and that its phone segmentation accuracy is 10% and 44% higher, respectively, than those of the other two models.
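The word, phone, state hierarchy described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation and contains no DBN inference; it only shows how a word expands into phones and then into a fixed number of states, and how a frame-level state path yields phone boundaries. The lexicon entries, the phone labels, the value of `STATES_PER_PHONE`, and the function names are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon; the phone labels are illustrative only.
LEXICON: Dict[str, List[str]] = {
    "one": ["w", "ah", "n"],
    "two": ["t", "uw"],
}

# Assumed value: the abstract says only "a fixed number of states".
STATES_PER_PHONE = 3

# One hidden-state label: (word, phone position in word, phone, state index).
StateLabel = Tuple[str, int, str, int]


def expand_word(word: str, states_per_phone: int = STATES_PER_PHONE) -> List[StateLabel]:
    """Expand a word into its ordered sequence of phone states."""
    return [
        (word, phone_idx, phone, state)
        for phone_idx, phone in enumerate(LEXICON[word])
        for state in range(states_per_phone)
    ]


def phone_segments(frame_path: List[StateLabel]) -> List[Tuple[str, int, int]]:
    """Collapse a per-frame state path into (phone, first frame, last frame).

    A new segment starts whenever the (word, phone position) pair changes.
    Simplification: an immediate repetition of the same single-phone word
    would be merged into one segment.
    """
    segments: List[list] = []
    prev_key = None
    for frame, (word, phone_idx, phone, _state) in enumerate(frame_path):
        key = (word, phone_idx)
        if key != prev_key:
            segments.append([phone, frame, frame])
        else:
            segments[-1][2] = frame
        prev_key = key
    return [(p, start, end) for p, start, end in segments]


if __name__ == "__main__":
    # Pretend every state was occupied for exactly two frames.
    path = [label for label in expand_word("two") for _ in range(2)]
    print(phone_segments(path))
```

Because every frame carries an explicit phone label in such a hierarchy, phone boundaries can be read directly off the decoded path, which is consistent with the abstract's point that the basic recognition units are phones rather than whole words.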

Original language: English
Pages (from-to): 173-178
Number of pages: 6
Journal: Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University
Volume: 26
Issue number: 2
State: Published - Apr 2008

Keywords

  • Continuous speech recognition
  • Phone segmentation
  • Single-stream multi-state dynamic Bayesian network (SM-DBN)
