Binaural localization of speech sources in 3-D using a composite feature vector of the HRTF

Xiang Wu, Dumidu S. Talagala, Wen Zhang, Thushara D. Abhayapala

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

7 Citations (Scopus)

Abstract

Binaural localization of speech sources in 3-D, using head-related transfer functions (HRTFs), always suffers elevation ambiguity due to the limited high frequency spectral information available at the receivers. This paper presents a method that overcomes this limitation by exploiting the interaural phase and magnitude features present in the HRTF. We (i) introduce a new feature vector that combines these two sets of features in a non-linear fashion, and (ii) propose a mechanism to extract this feature vector free from distortion by the speech spectra. The performance of the proposed method is evaluated and compared with a correlation-based HRTF database matching approach and a two-step localization technique for multiple source positions, HRTFs (individuals) and speech inputs. The results suggest that up to 20% improvement in localization performance can be achieved for moderate signal-to-noise ratios.

Original language: English
Title of host publication: 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2654-2658
Number of pages: 5
ISBN (Electronic): 9781467369978
Publication status: Published - 4 Aug 2015
Externally published
Event: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Brisbane, Australia
Duration: 19 Apr 2014 to 24 Apr 2014

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2015-August
ISSN (Print): 1520-6149

Conference

Conference: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
Country/Territory: Australia
City: Brisbane
Period: 19/04/14 to 24/04/14
