Audio-visual human recognition using semi-supervised spectral learning and hidden Markov models

Wei Feng, Lei Xie, Jia Zeng, Zhi Qiang Liu

Research output: Contribution to journal › Article › peer-review

15 Scopus citations

Abstract

This paper presents a multimodal system for reliable human identity recognition under varying conditions. Our system fuses face recognition and speech recognition within a general probabilistic framework. For face recognition, we propose a new spectral learning algorithm that considers not only the discriminative relations among the training data but also the generative models for each class. Because labeling faces is tedious and costly in practice, our spectral face learning uses a semi-supervised strategy: only a small number of labeled faces are used in the training step, and their labels are optimally propagated to the unlabeled training faces. Besides requiring much less labeled data, our algorithm also provides a natural way to explicitly train an outlier model that approximately represents unauthorized faces. To make the system more robust across environments, face recognition is complemented by a speaker identification agent. Specifically, this agent models the statistical variations of fixed-phrase speech using speaker-dependent word hidden Markov models. Experiments on benchmark databases validate the effectiveness of our face recognition and speaker identification agents, and demonstrate that recognition accuracy can be clearly improved by integrating these two independent biometric sources.
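
The abstract does not spell out the fusion rule, so the following is only a minimal sketch of one common way to combine two independent biometric sources: a weighted sum of per-identity log-likelihoods, normalized into posteriors. The function names, the `face_weight` parameter, the identity labels, and all scores are hypothetical and are not taken from the paper. With equal weights, the ranking is the same as multiplying the two likelihoods under an independence assumption.

```python
import math


def fuse_scores(face_loglik, speech_loglik, face_weight=0.5):
    """Combine per-identity log-likelihoods from two modalities.

    Assumption (not from the paper): the face and speech scores are
    conditionally independent given the identity, so their
    log-likelihoods can be added; face_weight is a hypothetical
    reliability weight in [0, 1].
    """
    fused = {
        identity: face_weight * face_loglik[identity]
        + (1.0 - face_weight) * speech_loglik[identity]
        for identity in face_loglik
    }
    # Normalize with log-sum-exp so the posteriors sum to 1.
    peak = max(fused.values())
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in fused.values()))
    return {identity: math.exp(v - log_norm) for identity, v in fused.items()}


def recognize(face_loglik, speech_loglik, face_weight=0.5):
    """Return the most probable identity and the fused posteriors."""
    posteriors = fuse_scores(face_loglik, speech_loglik, face_weight)
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


# Made-up scores; "outlier" stands in for an unauthorized-person model.
face = {"alice": -2.0, "bob": -1.8, "outlier": -4.0}
speech = {"alice": -1.0, "bob": -3.0, "outlier": -5.0}
best, posteriors = recognize(face, speech)
print(best, posteriors)
```

In this toy example the face scores alone slightly favor "bob", while the fused scores favor "alice", which illustrates how a second modality can correct an ambiguous decision from the first.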

Original language: English
Pages (from-to): 188-195
Number of pages: 8
Journal: Journal of Visual Languages and Computing
Volume: 20
Issue number: 3
State: Published - Jun 2009

Keywords

  • Face recognition
  • Hidden Markov models (HMMs)
  • Semi-supervised spectral learning
  • Speaker identification
