Triseme decision trees in the continuous speech recognition system for talking head animation

Xie Lei, Zhao Rongchun, Jiang Dongmei, Cravyse Ilse, Sahli Hichem, Conlenis Jan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

A viseme is an audio-visual model for speech-driven talking head animation. In this paper, a viseme HMM-based speech recognition system is built to drive a talking head. Trisemes are used to take mouth shape contextual information into account and so obtain accurate models. Because the number of models grows rapidly, decision tree based state tying is adopted in the triseme modeling to obtain robust models from the limited training data. Similarity of mouth shapes (SMS) is proposed for designing the visual question set used in the tree building process. Experimental results show that SMS is a good measure of mouth shape context and that decision trees are a feasible way to obtain robust model parameter estimates.

Original language: English
Title of host publication: Proceedings of the International Conference on Active Media Technology
Editors: J.P. Li, J. Liu, N. Zhong, J. Yen, J. Zhao
Pages: 389-395
Number of pages: 7
State: Published - 2003
Event: Proceedings of the Second International Conference on Active Media Technology - Chongqing, China
Duration: 29 May 2003 → 31 May 2003

Publication series

Name: Proceedings of the International Conference on Active Media Technology

Conference

Conference: Proceedings of the Second International Conference on Active Media Technology
Country/Territory: China
City: Chongqing
Period: 29/05/03 → 31/05/03
