Lip assistant: Visualize speech for hearing impaired people in multimedia services

Lei Xie, Yi Wang, Zhi Qiang Liu

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

4 Citations (Scopus)

Abstract

This paper presents a very low bit rate speech-to-video synthesizer, named lip assistant, to help hearing-impaired people better access multimedia services via lipreading. Lip assistant automatically converts acoustic speech to lip parameters at a bit rate of 2.2 kbps and decodes them to video-realistic mouth animation on the fly. We use multi-stream HMMs (MSHMMs) and principal component analysis (PCA) to model the audio-visual speech and the visual articulations, both of which are learned from audio-visual (AV) facial recordings. Speech is converted to lip parameters with natural dynamics by an expectation maximization (EM)-based audio-to-lip converter. The video synthesizer generates video-realistic mouth animations from the encoded lip parameters via PCA expansion. Finally, the mouth animation is superimposed on the original video to help hearing-impaired viewers better understand the audio-visual content. Experimental results show that lip assistant can significantly improve speech intelligibility for both machines and humans.
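The PCA step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`fit_pca`, `encode`, `decode`), the 16x16 frame size, the choice of 3 components, and the synthetic data are all assumptions made for illustration, and the MSHMM and EM-based audio-to-lip conversion are not shown. The sketch only demonstrates how a mouth frame might be reduced to a few lip parameters and then rebuilt by PCA expansion.

```python
import numpy as np

def fit_pca(frames, n_components):
    """Learn a mean image and an orthonormal basis from flattened frames."""
    mean = frames.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, vt[:n_components]

def encode(frame, mean, basis):
    """Project one flattened frame onto the basis to obtain lip parameters."""
    return basis @ (frame - mean)

def decode(params, mean, basis):
    """PCA expansion: rebuild an approximate frame from lip parameters."""
    return mean + basis.T @ params

# Synthetic stand-in for mouth images (an assumption, not real data):
# 200 flattened 16x16 frames that vary along only 3 underlying directions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 256))
frames = latent @ mixing + 5.0

mean, basis = fit_pca(frames, n_components=3)
params = encode(frames[0], mean, basis)   # 3 numbers instead of 256 pixels
rebuilt = decode(params, mean, basis)
max_error = float(np.max(np.abs(rebuilt - frames[0])))
```

Because the synthetic frames vary along exactly 3 directions, 3 components rebuild them up to floating-point error. Real mouth images would need more components, and the reconstruction would be approximate.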

Original language: English
Title of host publication: 2006 IEEE International Conference on Systems, Man and Cybernetics
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4331-4336
Number of pages: 6
ISBN (Print): 1424401003, 9781424401000
Publication status: Published - 2006
Externally published
Event: 2006 IEEE International Conference on Systems, Man and Cybernetics - Taipei, Taiwan, China
Duration: 8 Oct 2006 to 11 Oct 2006

Publication series

Name: Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
Volume: 5
ISSN (Print): 1062-922X

Conference

Conference: 2006 IEEE International Conference on Systems, Man and Cybernetics
Country/Territory: Taiwan, China
City: Taipei
Period: 8/10/06 to 11/10/06
