Triseme decision trees in the continuous speech recognition system for a talking head

Dong Mei Jiang, Lei Xie, Ilse Ravyse, Rong Chun Zhao, Hichem Sahli, Jan Cornelis

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

3 citations (Scopus)

Abstract

In this paper, we present a continuous speech recognition system based on visemes (the basic speech units in the visual domain), which segments speech into viseme sequences with timing boundaries to drive a talking head. In viseme Hidden Markov Model (HMM) training, the instances of a viseme in different contexts are formulated as trisemes. Based on the mouth shape parameter Liprounding and the defined viseme similarity weight (VSW) from the 3D viseme facial models, 166 questions concerning the viseme contexts are designed to build triseme decision trees that tie the states of trisemes with similar contexts, so that they can share the same parameters. To evaluate system performance, image-related measurements are also applied to the resulting viseme sequences, with 'jerky instances' in the Liprounding and VSW graphs used to assess their smoothness. Results show that, compared to the phoneme-based system, the tied-state triseme-based speech recognition system gives talking head animation with smoother and more plausible mouth shapes.

Original language: English
Title of host publication: Proceedings of 2002 International Conference on Machine Learning and Cybernetics
Pages: 2097-2101
Number of pages: 5
Publication status: Published - 2002
Event: Proceedings of 2002 International Conference on Machine Learning and Cybernetics - Beijing, China
Duration: 4 Nov 2002 to 5 Nov 2002

Publication series

Name: Proceedings of 2002 International Conference on Machine Learning and Cybernetics
Volume: 4

Conference

Conference: Proceedings of 2002 International Conference on Machine Learning and Cybernetics
Country/Territory: China
City: Beijing
Period: 4/11/02 to 5/11/02
