Adaptive stream reliability modeling based on local dispersion measures for audio visual speech recognition

Lei Xie, Rong Chun Zhao, Zhi Qiang Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

This paper proposes an adaptive stream reliability modeling technique for audio-visual speech recognition (AVSR). Because recognition conditions vary locally, we present two local measures, frame dispersion and window dispersion, that capture the temporal discriminative power and noise level of both the audio and visual streams. The dispersions are then mapped to stream exponents according to the minimum classification error (MCE) criterion. Experiments on a connected-digits task show that our method consistently outperforms the popular Discriminative Training (DT) and Grid Search (GS) methods at various signal-to-noise ratios (SNRs), improving, for example, the word accuracy rate (WAR) from 94.7% to 96.4% at a 28 dB SNR.
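The abstract only outlines the pipeline (per-frame dispersion → windowed smoothing → stream exponents), not the exact formulas. The sketch below illustrates one common instantiation under stated assumptions: frame dispersion as the average pairwise difference among the top-N class log-likelihoods, window dispersion as a moving average, and a sigmoid mapping to exponents whose parameters (`alpha`, `beta` here) stand in for the MCE-trained mapping in the paper; all three function names and formulas are illustrative, not the authors' definitions.

```python
import numpy as np

def frame_dispersion(log_likelihoods, n_best=4):
    # Average pairwise difference among the top-N class log-likelihoods
    # at one frame; a larger value means the stream separates classes
    # more clearly, i.e. is more reliable at this instant.
    top = np.sort(np.asarray(log_likelihoods, dtype=float))[::-1][:n_best]
    n = len(top)
    pairs = [top[i] - top[j] for i in range(n) for j in range(i + 1, n)]
    return 2.0 * sum(pairs) / (n * (n - 1))

def window_dispersion(frame_dispersions, win=9):
    # Centred moving average of frame dispersions, capturing the
    # local (window-level) recognition condition.
    d = np.asarray(frame_dispersions, dtype=float)
    kernel = np.ones(win) / win
    return np.convolve(d, kernel, mode="same")

def stream_exponents(disp_audio, disp_video, alpha=1.0, beta=0.0):
    # Map the two dispersions to exponents lambda_a + lambda_v = 1
    # via a sigmoid of their difference; alpha/beta are placeholders
    # for parameters that the paper trains under the MCE criterion.
    lam_a = 1.0 / (1.0 + np.exp(-(alpha * (disp_audio - disp_video) + beta)))
    return lam_a, 1.0 - lam_a
```

For instance, when the audio stream is cleaner (higher dispersion), `stream_exponents(2.5, 1.0)` yields an audio exponent above 0.5, shifting the fused score toward the audio stream.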

Original language: English
Title of host publication: 2005 International Conference on Machine Learning and Cybernetics, ICMLC 2005
Pages: 4852-4857
Number of pages: 6
State: Published - 2005
Event: International Conference on Machine Learning and Cybernetics, ICMLC 2005 - Guangzhou, China
Duration: 18 Aug 2005 - 21 Aug 2005

Publication series

Name: 2005 International Conference on Machine Learning and Cybernetics, ICMLC 2005

Conference

Conference: International Conference on Machine Learning and Cybernetics, ICMLC 2005
Country/Territory: China
City: Guangzhou
Period: 18/08/05 - 21/08/05

Keywords

  • Audio visual speech recognition
  • Dispersion
  • Lipreading
  • MCE-GPD
  • Stream exponents
