Triseme decision trees in the continuous speech recognition system for talking head animation

Xie Lei; Zhao Rongchun; Jiang Dongmei; Cravyse Ilse; Sahli Hichem; Conlenis Jan

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

A viseme is an audio-visual model unit for speech-driven talking head animation. In this paper, a viseme HMM-based speech recognition system is built to drive a talking head. Trisemes are used to take mouth shape contextual information into account and so obtain more accurate models. Because the number of models then grows rapidly, decision-tree-based state tying is adopted in the triseme modeling to obtain robust models from the limited training data. Similarity of mouth shapes (SMS) is proposed as the basis for designing the visual question set used in tree building. Experimental results show that SMS is a good measure of mouth shape context and that decision trees are a feasible way to obtain robust model parameter estimates.
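As a rough illustration of why context-dependent modeling makes the model count grow, the sketch below expands a viseme sequence into triseme labels and computes the upper bound on distinct trisemes. It is a minimal sketch under stated assumptions, not the authors' implementation: the `left-centre+right` label notation is borrowed from the common triphone convention, and the viseme names and the inventory size of 14 are hypothetical placeholders that do not come from the paper.

```python
from typing import List, Optional


def to_trisemes(visemes: List[str]) -> List[str]:
    """Expand a viseme sequence into context-dependent triseme labels.

    Label format 'left-centre+right' is an assumed, triphone-style notation.
    Sequence-initial and sequence-final units simply lack the missing context.
    """
    labels: List[str] = []
    for i, centre in enumerate(visemes):
        left: Optional[str] = visemes[i - 1] if i > 0 else None
        right: Optional[str] = visemes[i + 1] if i < len(visemes) - 1 else None
        label = centre
        if left is not None:
            label = f"{left}-{label}"
        if right is not None:
            label = f"{label}+{right}"
        labels.append(label)
    return labels


def max_triseme_count(n_visemes: int) -> int:
    """Upper bound on distinct trisemes: every (left, centre, right) triple."""
    return n_visemes ** 3


if __name__ == "__main__":
    # Hypothetical viseme names, used only for illustration.
    print(to_trisemes(["sil", "p", "a", "sil"]))
    # Hypothetical inventory size; the paper's viseme set is not given here.
    print(max_triseme_count(14))
```

With only limited training data, many of these context-dependent units are seen rarely or never, which is the motivation the abstract gives for tying states with a decision tree.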

Original language	English
Title of host publication	Proceedings of the International Conference on Active Media Technology
Editors	J.P. Li, J. Liu, N. Zhong, J. Yen, J. Zhao
Pages	389-395
Number of pages	7
State	Published - 2003
Event	Proceedings of the Second International Conference on Active Media Technology, Chongqing, China, 29 May 2003 → 31 May 2003

Publication series

Name	Proceedings of the International Conference on Active Media Technology

Conference

Conference	Proceedings of the Second International Conference on Active Media Technology
Country/Territory	China
City	Chongqing
Period	29/05/03 → 31/05/03

Cite this

Lei, X., Rongchun, Z., Dongmei, J., Ilse, C., Hichem, S., & Jan, C. (2003). Triseme decision trees in the continuous speech recognition system for talking head animation. In J. P. Li, J. Liu, N. Zhong, J. Yen, & J. Zhao (Eds.), Proceedings of the International Conference on Active Media Technology (pp. 389-395). (Proceedings of the International Conference on Active Media Technology).

@inproceedings{8006cde40b94447d800105ff35a84a9a,
title = "Triseme decision trees in the continuous speech recognition system for talking head animation",
abstract = "A viseme is an audio-visual model unit for speech-driven talking head animation. In this paper, a viseme HMM-based speech recognition system is built to drive a talking head. Trisemes are used to take mouth shape contextual information into account and so obtain more accurate models. Because the number of models then grows rapidly, decision-tree-based state tying is adopted in the triseme modeling to obtain robust models from the limited training data. Similarity of mouth shapes (SMS) is proposed as the basis for designing the visual question set used in tree building. Experimental results show that SMS is a good measure of mouth shape context and that decision trees are a feasible way to obtain robust model parameter estimates.",
author = "Xie Lei and Zhao Rongchun and Jiang Dongmei and Cravyse Ilse and Sahli Hichem and Conlenis Jan",
year = "2003",
language = "English",
isbn = "9812383433",
series = "Proceedings of the International Conference on Active Media Technology",
pages = "389--395",
editor = "J.P. Li and J. Liu and N. Zhong and J. Yen and J. Zhao",
booktitle = "Proceedings of the International Conference on Active Media Technology",
note = "Proceedings of the Second International Conference on Active Media Technology ; Conference date: 29-05-2003 Through 31-05-2003",
}

TY - GEN
T1 - Triseme decision trees in the continuous speech recognition system for talking head animation
AU - Lei, Xie
AU - Rongchun, Zhao
AU - Dongmei, Jiang
AU - Ilse, Cravyse
AU - Hichem, Sahli
AU - Jan, Conlenis
PY - 2003
Y1 - 2003
AB - A viseme is an audio-visual model unit for speech-driven talking head animation. In this paper, a viseme HMM-based speech recognition system is built to drive a talking head. Trisemes are used to take mouth shape contextual information into account and so obtain more accurate models. Because the number of models then grows rapidly, decision-tree-based state tying is adopted in the triseme modeling to obtain robust models from the limited training data. Similarity of mouth shapes (SMS) is proposed as the basis for designing the visual question set used in tree building. Experimental results show that SMS is a good measure of mouth shape context and that decision trees are a feasible way to obtain robust model parameter estimates.
UR - http://www.scopus.com/inward/record.url?scp=0141929608&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:0141929608
SN - 9812383433
T3 - Proceedings of the International Conference on Active Media Technology
SP - 389
EP - 395
BT - Proceedings of the International Conference on Active Media Technology
A2 - Li, J.P.
A2 - Liu, J.
A2 - Zhong, N.
A2 - Yen, J.
A2 - Zhao, J.
T2 - Proceedings of the Second International Conference on Active Media Technology
Y2 - 29 May 2003 through 31 May 2003
ER -
