Lip assistant: Visualize speech for hearing impaired people in multimedia services

Lei Xie, Yi Wang, Zhi Qiang Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

This paper presents a very low bit rate speech-to-video synthesizer, named lip assistant, to help hearing-impaired people better access multimedia services via lipreading. Lip assistant automatically converts acoustic speech to lip parameters at a bit rate of 2.2 kbps and decodes them to video-realistic mouth animation on the fly. We use multi-stream HMMs (MSHMMs) and principal component analysis (PCA) to model the audio-visual speech and the visual articulations, both of which are learned from audio-visual (AV) facial recordings. Speech is converted to lip parameters with natural dynamics by an expectation maximization (EM)-based audio-to-lip converter. The video synthesizer generates video-realistic mouth animations from the encoded lip parameters via PCA expansion. Finally, the mouth animation is superimposed on the original video as an aid that helps hearing-impaired viewers better understand the audio-visual content. Experimental results show that lip assistant can significantly improve speech intelligibility for both machines and humans.
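The PCA step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 4-pixel "mouth image", the mean vector, the two basis vectors, and the function names `encode` and `decode` are all hypothetical, chosen only to show how a frame could be reduced to a few lip parameters and then rebuilt by PCA expansion (mean plus a weighted sum of basis vectors). In a real system the mean and basis would be learned from recordings rather than written by hand.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def encode(frame: Vector, mean: Vector, basis: List[Vector]) -> Vector:
    """Project a mean-centred frame onto orthonormal basis vectors.

    The resulting coefficients play the role of the lip parameters.
    """
    centred = [f - m for f, m in zip(frame, mean)]
    return [dot(centred, b) for b in basis]


def decode(params: Vector, mean: Vector, basis: List[Vector]) -> Vector:
    """PCA expansion: mean + sum over k of params[k] * basis[k]."""
    out = list(mean)
    for p, b in zip(params, basis):
        for i, b_i in enumerate(b):
            out[i] += p * b_i
    return out


# Hypothetical toy data: a 4-pixel image space with 2 orthonormal components.
MEAN: Vector = [0.5, 0.5, 0.5, 0.5]
BASIS: List[Vector] = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
]

frame: Vector = [0.9, 0.3, 0.9, 0.3]
params = encode(frame, MEAN, BASIS)          # 4 pixel values -> 2 parameters
reconstructed = decode(params, MEAN, BASIS)  # 2 parameters -> 4 pixel values
```

Here the toy frame lies exactly in the span of the two basis vectors, so the reconstruction matches the input up to floating-point rounding. With real images and a truncated basis the reconstruction would only approximate the original frame.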

Original language: English
Title of host publication: 2006 IEEE International Conference on Systems, Man and Cybernetics
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4331-4336
Number of pages: 6
ISBN (Print): 1424401003, 9781424401000
State: Published - 2006
Externally published: Yes
Event: 2006 IEEE International Conference on Systems, Man and Cybernetics - Taipei, Taiwan, Province of China
Duration: 8 Oct 2006 → 11 Oct 2006

Publication series

Name: Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
Volume: 5
ISSN (Print): 1062-922X

Conference

Conference: 2006 IEEE International Conference on Systems, Man and Cybernetics
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 8/10/06 → 11/10/06
