Binaural localization of speech sources in 3-D using a composite feature vector of the HRTF

Xiang Wu, Dumidu S. Talagala, Wen Zhang, Thushara D. Abhayapala

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

7 Scopus citations

Abstract

Binaural localization of speech sources in 3-D, using head-related transfer functions (HRTFs), always suffers elevation ambiguity due to the limited high frequency spectral information available at the receivers. This paper presents a method that overcomes this limitation by exploiting the interaural phase and magnitude features present in the HRTF. We (i) introduce a new feature vector that combines these two sets of features in a non-linear fashion, and (ii) propose a mechanism to extract this feature vector free from distortion by the speech spectra. The performance of the proposed method is evaluated and compared with a correlation-based HRTF database matching approach and a two-step localization technique for multiple source positions, HRTFs (individuals) and speech inputs. The results suggest that up to 20% improvement in localization performance can be achieved for moderate signal-to-noise ratios.

Original languageEnglish
Title of host publication2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages2654-2658
Number of pages5
ISBN (Electronic)9781467369978
DOIs
StatePublished - 4 Aug 2015
Externally publishedYes
Event40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Brisbane, Australia
Duration: 19 Apr 201424 Apr 2014

Publication series

NameICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume2015-August
ISSN (Print)1520-6149

Conference

Conference40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
Country/TerritoryAustralia
CityBrisbane
Period19/04/1424/04/14

Keywords

  • Binaural localization
  • cepstral transformation
  • generalized cross-correlation (GCC)
  • head related transfer function (HRTF)
  • phase transform (PHAT)

Fingerprint

Dive into the research topics of 'Binaural localization of speech sources in 3-D using a composite feature vector of the HRTF'. Together they form a unique fingerprint.

Cite this