A STEERED RESPONSE POWER APPROACH WITH BILINEAR PREDICTION-BASED TRADE-OFF PREWHITENING FOR SPEAKER LOCALIZATION

Zhiheng Wang, Hongsen He, Jingdong Chen, Jacob Benesty, Yi Yu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

2 Scopus citations

Abstract

This paper studies the problem of acoustic source localization in room environments. It presents an improved steered response power (SRP) approach with low-complexity, trade-off prewhitening. The method consists of two steps. In the first, the linear predictor used to model the speech signals is formulated as a bilinear form, and a group of convex-constrained linear prediction sub-models with respect to dual sub-predictors is established to pre-filter the microphone signals. In the second, the pre-filtered (prewhitened) microphone signals are used in SRP for speaker localization. Simulation results demonstrate the properties of the presented method: it is robust to reverberation and noise, and it is computationally efficient thanks to the bilinear form.
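
The two-step structure described in the abstract (prewhiten the microphone signals, then steer) can be illustrated with a minimal sketch. It is not the authors' method: it substitutes a conventional least-squares linear predictor for the bilinear, convex-constrained sub-predictors, has no trade-off parameter, and steers over integer sample delays for a single microphone pair rather than over candidate source positions. All names and parameters (`lp_prewhiten`, `srp_delay`, `order`, `reg`, `max_lag`) and the synthetic test signal are assumptions made for illustration.

```python
import numpy as np


def lp_prewhiten(x, order=8, reg=1e-6):
    """Return the residual of a conventional forward linear predictor.

    Stand-in for the paper's bilinear predictor, which is NOT implemented here.
    """
    n_samples = len(x)
    # Row n of X holds the past samples x[n-1], ..., x[n-order].
    X = np.column_stack(
        [x[order - k - 1 : n_samples - k - 1] for k in range(order)]
    )
    d = x[order:]
    # Regularised normal equations for the predictor coefficients.
    a = np.linalg.solve(X.T @ X + reg * np.eye(order), X.T @ d)
    return d - X @ a  # prediction error, i.e. the prewhitened signal


def srp_delay(y1, y2, max_lag):
    """Return the lag (in samples) that maximises delay-and-sum output power."""
    lags = np.arange(-max_lag, max_lag + 1)
    core = slice(max_lag, len(y1) - max_lag)  # skip samples wrapped by np.roll
    powers = np.array(
        [np.sum((y1[core] + np.roll(y2, -lag)[core]) ** 2) for lag in lags]
    )
    return int(lags[np.argmax(powers)]), powers


# Synthetic example: a coloured (AR(1)) source reaching microphone 2
# five samples after microphone 1, with a little sensor noise.
rng = np.random.default_rng(0)
w = rng.standard_normal(4000)
s = np.zeros_like(w)
for n in range(1, len(w)):
    s[n] = 0.9 * s[n - 1] + w[n]

true_delay = 5
x1 = s[true_delay:] + 0.05 * rng.standard_normal(len(s) - true_delay)
x2 = s[:-true_delay] + 0.05 * rng.standard_normal(len(s) - true_delay)

# Step 1: prewhiten each channel. Step 2: steer over candidate delays.
y1 = lp_prewhiten(x1)
y2 = lp_prewhiten(x2)
est_lag, _ = srp_delay(y1, y2, max_lag=20)
print("estimated delay (samples):", est_lag)
```

Both channels lose the same number of leading samples in `lp_prewhiten`, so the relative delay between them is preserved. In a full SRP search the delay candidates would be derived from candidate source positions and the array geometry, and the power would be accumulated over all microphone pairs.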

Original language: English
Title of host publication: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1046-1050
Number of pages: 5
ISBN (Electronic): 9798350344851
State: Published - 2024
Event: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Seoul, Korea, Republic of
Duration: 14 Apr 2024 to 19 Apr 2024

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024
Country/Territory: Korea, Republic of
City: Seoul
Period: 14/04/24 to 19/04/24

Keywords

  • Acoustic source localization
  • bilinear forms
  • linear prediction
  • trade-off prewhitening
