Real-time speech driven talking avatar

Bingfeng Li, Lei Xie, Xiangzeng Zhou, Zhonghua Fu, Yanning Zhang

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

This paper presents a real-time speech-driven talking avatar. Unlike most talking avatars, whose speech-synchronized facial animation is generated offline, this avatar can speak with live speech input. Such a life-like talking avatar has many potential applications in videophones, virtual conferencing, audio/video chat, and entertainment. Since phonemes are the smallest units of pronunciation, a real-time phoneme recognizer was built, and a phoneme recognition and output algorithm synchronizes the facial motion with the live input speech. A dynamic viseme generation algorithm models coarticulation effects when computing the facial animation parameters (FAPs) from the recognized phonemes, and the generated FAPs drive an MPEG-4 compliant avatar model. Tests show that the avatar's motion is synchronized and natural, with mean opinion score (MOS) values of 3.42 and 3.5.
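The abstract does not spell out the dynamic viseme generation algorithm, but the general idea of blending neighboring viseme targets to model coarticulation can be sketched as follows. The snippet below is a minimal illustration in Python: the viseme inventory, the toy three-dimensional FAP vectors, and the Gaussian dominance weights (in the spirit of the classic Cohen-Massaro coarticulation model) are all illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Hypothetical viseme targets: each viseme maps to a small FAP vector.
# A real MPEG-4 system uses the full FAP set; three values suffice here
# (e.g. jaw opening, lip stretch, lip protrusion) to show the idea.
VISEME_FAPS = {
    "sil": np.array([0.0, 0.0, 0.0]),   # neutral face
    "aa":  np.array([0.9, 0.2, 0.1]),   # open jaw for /a/
    "uw":  np.array([0.3, 0.0, 0.8]),   # rounded lips for /u/
    "m":   np.array([0.0, 0.1, 0.2]),   # closed lips for /m/
}

def coarticulated_faps(phonemes, frames_per_phoneme=5, spread=1.0):
    """Blend neighboring viseme targets with distance-decaying weights,
    a simple stand-in for a coarticulation model (the paper's exact
    algorithm is not given in the abstract)."""
    targets = np.stack([VISEME_FAPS[p] for p in phonemes])
    n = len(phonemes)
    frames = []
    for i in range(n):
        for f in range(frames_per_phoneme):
            t = i + f / frames_per_phoneme  # continuous time in phoneme units
            weights = np.array([np.exp(-((t - j) ** 2) / (2 * spread ** 2))
                                for j in range(n)])
            weights /= weights.sum()
            # Each output frame is a convex combination of viseme targets,
            # so adjacent phonemes influence each other's mouth shapes.
            frames.append(weights @ targets)
    return np.array(frames)

if __name__ == "__main__":
    fap_stream = coarticulated_faps(["sil", "m", "aa", "uw", "sil"])
    print(fap_stream.shape)  # (25, 3): one smoothed FAP vector per frame
```

In a live setting, each FAP frame produced this way would be sent to the MPEG-4 avatar renderer as soon as the recognizer emits a phoneme, which is what keeps the facial motion synchronized with the incoming speech.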

Original language: English
Pages (from-to): 1180-1186
Number of pages: 7
Journal: Qinghua Daxue Xuebao/Journal of Tsinghua University
Volume: 51
Issue number: 9
State: Published - Sep 2011

Keywords

  • Facial animation
  • Talking avatar
  • Visual speech synthesis
