A robust dynamic mouth feature based on Visemic LDA for audio visual speech recognition

Lei Xie, Zhong Hua Fu, Dong Mei Jiang, Rong Chun Zhao, Werner Verhelst, Hichem Sahli, Jan Conlenis

Research output: Contribution to journal, Article, peer-review

Abstract

This paper presents a robust visual feature based on Visemic LDA for audio visual speech recognition. The feature captures dynamic lip contour information and reflects the viseme classes of visual speech. The paper also introduces an automatic method that labels the LDA training data from speech recognition results, which avoids tedious manual labeling and the associated labeling errors. Experimental results show that an audio visual speech recognition system built on this visual feature greatly increases the speech recognition rate in noisy conditions. Combined with a multi-stream HMM, the visual feature yields a recognition rate of over 80% at an SNR of 10 dB.

Original language: English
Pages (from-to): 64-68
Number of pages: 5
Journal: Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology
Volume: 27
Issue number: 1
State: Published - Jan 2005

Keywords

  • ASM
  • Audio visual speech recognition
  • Linear Discriminant Analysis (LDA)
  • Speech recognition
  • Viseme
