Speech animation using coupled hidden Markov models

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

We present a novel speech animation approach using coupled hidden Markov models (CHMMs). Unlike conventional HMMs, which use a single state chain to model audio-visual speech with tight inter-modal synchronization, CHMMs model the asynchrony, the differing discriminative abilities, and the temporal coupling between the audio and visual speech, all of which are important factors for natural-looking animation. Based on the audio-visual CHMMs, visual animation parameters are predicted from audio through an EM-based audio-to-visual conversion algorithm. Experiments on the JEWEL AV database show that, compared with conventional HMMs, the CHMMs output visual parameters that are much closer to the actual ones. Explicit modelling of audio-visual asynchrony is therefore promising for speech animation.
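As a rough illustration of the modelling idea described in the abstract (not the authors' implementation), the sketch below builds a minimal two-chain coupled HMM in Python: each chain's next state depends on the previous states of both chains, and paired audio/visual observation sequences are scored with an exact forward pass over the product state space. The state counts, discrete emissions, and all names are hypothetical assumptions; the paper itself predicts continuous visual animation parameters via an EM-based conversion, which is not reproduced here.

```python
# Hypothetical two-chain coupled HMM (CHMM) sketch.
# Each chain's transition is conditioned on the previous states of BOTH chains,
# so the audio and visual streams stay loosely coupled rather than being
# locked to a single shared state sequence as in a conventional HMM.

import numpy as np

rng = np.random.default_rng(0)

N_AUDIO, N_VISUAL = 3, 3   # states per chain (illustrative sizes)
N_OBS = 4                  # discrete observation symbols per stream

def random_stochastic(shape):
    """Random tensor whose last axis sums to 1 (a conditional distribution)."""
    m = rng.random(shape)
    return m / m.sum(axis=-1, keepdims=True)

# Coupled transitions: P(a_t | a_{t-1}, v_{t-1}) and P(v_t | a_{t-1}, v_{t-1}).
A_audio  = random_stochastic((N_AUDIO, N_VISUAL, N_AUDIO))   # [a_prev, v_prev, a_next]
A_visual = random_stochastic((N_AUDIO, N_VISUAL, N_VISUAL))  # [a_prev, v_prev, v_next]
B_audio  = random_stochastic((N_AUDIO, N_OBS))               # per-chain emissions
B_visual = random_stochastic((N_VISUAL, N_OBS))
pi_audio  = random_stochastic((N_AUDIO,))
pi_visual = random_stochastic((N_VISUAL,))

def chmm_forward(obs_audio, obs_visual):
    """Exact forward pass on the product state space (audio state, visual state).

    Returns the log-likelihood of the paired observation sequences.
    """
    T = len(obs_audio)
    # alpha[a, v] ~ P(observations up to t, audio state a, visual state v)
    alpha = (pi_audio[:, None] * pi_visual[None, :]
             * B_audio[:, obs_audio[0], None]
             * B_visual[None, :, obs_visual[0]])
    log_lik = 0.0
    for t in range(1, T):
        scale = alpha.sum()
        log_lik += np.log(scale)
        alpha /= scale
        # Chains transition conditionally independently given the previous
        # (audio, visual) state pair.
        new_alpha = np.einsum("ij,ija,ijv->av", alpha, A_audio, A_visual)
        alpha = (new_alpha
                 * B_audio[:, obs_audio[t], None]
                 * B_visual[None, :, obs_visual[t]])
    log_lik += np.log(alpha.sum())
    return log_lik

# Toy paired sequences (e.g., quantized audio and visual features).
obs_a = rng.integers(0, N_OBS, size=10)
obs_v = rng.integers(0, N_OBS, size=10)
print("CHMM log-likelihood:", chmm_forward(obs_a, obs_v))
```

Working on the product state space keeps inference exact for this small sketch; in practice, CHMM training and the audio-to-visual conversion described in the paper would use EM-style procedures over these coupled chains.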

Original language: English
Title of host publication: Proceedings - 18th International Conference on Pattern Recognition, ICPR 2006
Pages: 1128-1131
Number of pages: 4
DOIs
State: Published - 2006
Externally published: Yes
Event: 18th International Conference on Pattern Recognition, ICPR 2006 - Hong Kong, China
Duration: 20 Aug 2006 - 24 Aug 2006

Publication series

Name: Proceedings - International Conference on Pattern Recognition
Volume: 1
ISSN (Print): 1051-4651

Conference

Conference: 18th International Conference on Pattern Recognition, ICPR 2006
Country/Territory: China
City: Hong Kong
Period: 20/08/06 - 24/08/06
