Dynamic Bayesian network inversion for robust speech recognition

Lei Xie; Hongwu Yang

doi:10.1093/ietisy/e90-d.7.1117

School of Computer Science

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

This paper presents an inversion algorithm for dynamic Bayesian networks towards robust speech recognition, namely DBNI, which is a generalization of hidden Markov model inversion (HMMI). As a dual procedure of expectation maximization (EM)-based model reestimation, DBNI finds the 'uncontaminated' speech by moving the input noisy speech to the Gaussian means under the maximum likelihood (ML) sense given the DBN models trained on clean speech. This algorithm can provide both the expressive advantage from DBN and the noise-removal feature from model inversion. Experiments on the Aurora 2.0 database show that the hidden feature model (a typical DBN for speech recognition) with the DBNI algorithm achieves superior performance in terms of word error rate reduction.
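The core idea in the abstract, moving a noisy input toward Gaussian means in the maximum likelihood sense, can be illustrated with a much simpler model than the paper's DBN. The sketch below is an assumption-laden toy: it uses a single diagonal-covariance Gaussian mixture for one feature frame, whereas the paper works with DBN models over whole utterances. The function names (`gmm_inversion_step`, `invert`) and all parameter values are hypothetical and do not come from the paper.

```python
import math

def gmm_inversion_step(x, weights, means, variances):
    """One EM-style inversion update for a single feature frame.

    x is the current (noisy) frame; weights, means and variances describe a
    diagonal-covariance GMM assumed to be trained on clean speech.
    All names here are illustrative, not taken from the paper.
    """
    # E-step: log posterior (up to a constant) of each component given x.
    log_post = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, m, v in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        log_post.append(ll)
    # Normalise in a numerically stable way.
    mx = max(log_post)
    post = [math.exp(lp - mx) for lp in log_post]
    total = sum(post)
    post = [p / total for p in post]

    # "M-step" over the input instead of the model parameters:
    # the new frame is the posterior- and precision-weighted average of the means.
    new_x = []
    for d in range(len(x)):
        num = sum(p * mu[d] / var[d] for p, mu, var in zip(post, means, variances))
        den = sum(p / var[d] for p, var in zip(post, variances))
        new_x.append(num / den)
    return new_x

def invert(x, weights, means, variances, iters=10):
    """Repeat the update a fixed number of times (iteration count is arbitrary)."""
    for _ in range(iters):
        x = gmm_inversion_step(x, weights, means, variances)
    return x
```

With a two-component mixture whose means are 0 and 10, a frame starting at 9 is pulled toward the nearer mean at 10. In this toy the model parameters stay fixed and only the input is re-estimated, which is the sense in which inversion is a dual of EM-based model reestimation.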

Original language	English
Pages (from-to)	1117-1120
Number of pages	4
Journal	IEICE Transactions on Information and Systems
Volume	E90-D
Issue number	7
DOIs	https://doi.org/10.1093/ietisy/e90-d.7.1117
State	Published - Jul 2007

Keywords

Dynamic Bayesian network
Hidden Markov model
Speech recognition

Access to Document

10.1093/ietisy/e90-d.7.1117

Cite this

@article{e031be30b2044c1ca3eda5fe40d79d38,

title = "Dynamic Bayesian network inversion for robust speech recognition",

abstract = "This paper presents an inversion algorithm for dynamic Bayesian networks towards robust speech recognition, namely DBNI, which is a generalization of hidden Markov model inversion (HMMI). As a dual procedure of expectation maximization (EM)-based model reestimation, DBNI finds the 'uncontaminated' speech by moving the input noisy speech to the Gaussian means under the maximum likelihood (ML) sense given the DBN models trained on clean speech. This algorithm can provide both the expressive advantage from DBN and the noise-removal feature from model inversion. Experiments on the Aurora 2.0 database show that the hidden feature model (a typical DBN for speech recognition) with the DBNI algorithm achieves superior performance in terms of word error rate reduction.",

keywords = "Dynamic Bayesian network, Hidden Markov model, Speech recognition",

author = "Lei Xie and Hongwu Yang",

year = "2007",

month = jul,

doi = "10.1093/ietisy/e90-d.7.1117",

language = "English",

volume = "E90-D",

pages = "1117--1120",

journal = "IEICE Transactions on Information and Systems",

issn = "0916-8532",

publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",

number = "7",

}

TY - JOUR

T1 - Dynamic Bayesian network inversion for robust speech recognition

AU - Xie, Lei

AU - Yang, Hongwu

PY - 2007/7

Y1 - 2007/7

N2 - This paper presents an inversion algorithm for dynamic Bayesian networks towards robust speech recognition, namely DBNI, which is a generalization of hidden Markov model inversion (HMMI). As a dual procedure of expectation maximization (EM)-based model reestimation, DBNI finds the 'uncontaminated' speech by moving the input noisy speech to the Gaussian means under the maximum likelihood (ML) sense given the DBN models trained on clean speech. This algorithm can provide both the expressive advantage from DBN and the noise-removal feature from model inversion. Experiments on the Aurora 2.0 database show that the hidden feature model (a typical DBN for speech recognition) with the DBNI algorithm achieves superior performance in terms of word error rate reduction.

AB - This paper presents an inversion algorithm for dynamic Bayesian networks towards robust speech recognition, namely DBNI, which is a generalization of hidden Markov model inversion (HMMI). As a dual procedure of expectation maximization (EM)-based model reestimation, DBNI finds the 'uncontaminated' speech by moving the input noisy speech to the Gaussian means under the maximum likelihood (ML) sense given the DBN models trained on clean speech. This algorithm can provide both the expressive advantage from DBN and the noise-removal feature from model inversion. Experiments on the Aurora 2.0 database show that the hidden feature model (a typical DBN for speech recognition) with the DBNI algorithm achieves superior performance in terms of word error rate reduction.

KW - Dynamic Bayesian network

KW - Hidden Markov model

KW - Speech recognition

UR - http://www.scopus.com/inward/record.url?scp=68249162405&partnerID=8YFLogxK

U2 - 10.1093/ietisy/e90-d.7.1117

DO - 10.1093/ietisy/e90-d.7.1117

M3 - Article

AN - SCOPUS:68249162405

SN - 0916-8532

VL - E90-D

SP - 1117

EP - 1120

JO - IEICE Transactions on Information and Systems

JF - IEICE Transactions on Information and Systems

IS - 7

ER -
