A front-end speech enhancement system for robust automotive speech recognition

Haikun Wang, Zhongfu Ye, Jingdong Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

This paper presents a front-end speech enhancement approach to robust speech recognition in automotive environments. It combines model-based voice activity detection (VAD), relative transfer function (RTF) based generalized sidelobe cancellation, and single-channel post-filtering to enhance the speech signal of interest, thereby improving the robustness of speech recognition. First, we choose four typical driving scenarios, which include most of the noise types in automobiles, to record training data. The recorded data are then used to train Gaussian mixture models (GMMs) for both speech and noise. The trained GMMs are subsequently used to estimate the speech presence probability on a frame-by-frame basis. This speech presence probability then serves as the basic information for RTF estimation, adaptive beamforming, and post-filtering. Experiments are conducted in real automotive environments, and the results show that the developed method can significantly improve the performance of both VAD and automatic speech recognition (ASR).
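
The frame-by-frame speech presence probability described in the abstract can be illustrated with a minimal sketch. The abstract does not state the features, the number of mixture components, the covariance structure, or how the prior is set, so everything below is an assumption made for illustration: diagonal-covariance GMMs, a fixed prior `prior_speech`, two-dimensional toy features, and hand-picked toy parameters in place of trained models. It is not the authors' implementation.

```python
import math


def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one feature frame under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        log_terms.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def speech_presence_probability(frame, speech_gmm, noise_gmm, prior_speech=0.5):
    """Posterior probability that a frame contains speech (Bayes' rule).

    Each GMM is a (weights, means, variances) tuple. The fixed prior is an
    assumption of this sketch, not something stated in the abstract.
    """
    log_s = gmm_log_likelihood(frame, *speech_gmm) + math.log(prior_speech)
    log_n = gmm_log_likelihood(frame, *noise_gmm) + math.log(1.0 - prior_speech)
    diff = log_n - log_s
    if diff > 700.0:  # avoid overflow in exp; the posterior is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))


# Toy (hypothetical) models: two speech components, one noise component.
speech_gmm = ([0.5, 0.5], [[4.0, 4.0], [6.0, 5.0]], [[1.0, 1.0], [1.0, 1.0]])
noise_gmm = ([1.0], [[0.0, 0.0]], [[1.0, 1.0]])

frames = [[5.0, 4.5], [0.0, 0.0]]
probabilities = [speech_presence_probability(f, speech_gmm, noise_gmm) for f in frames]
```

In a pipeline of the kind the abstract outlines, a per-frame probability like this could serve as a soft weight for the later stages (RTF estimation, adaptive beamforming, post-filtering) rather than a hard speech/non-speech decision; how the paper actually uses it is not specified here.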

Original language: English
Title of host publication: 2018 11th International Symposium on Chinese Spoken Language Processing, ISCSLP 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1-5
Number of pages: 5
ISBN (Electronic): 9781538656273
State: Published - 2 Jul 2018
Event: 11th International Symposium on Chinese Spoken Language Processing, ISCSLP 2018 - Taipei, Taiwan, Province of China
Duration: 26 Nov 2018 to 29 Nov 2018

Publication series

Name: 2018 11th International Symposium on Chinese Spoken Language Processing, ISCSLP 2018 - Proceedings

Conference

Conference: 11th International Symposium on Chinese Spoken Language Processing, ISCSLP 2018
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 26/11/18 to 29/11/18

Keywords

  • Generalized sidelobe cancellation
  • Microphone array
  • Model-based
  • Relative transfer function estimation
  • Speech enhancement
  • Speech recognition
  • Voice activity detection
