A speech enhancement system for automotive speech recognition with a hybrid voice activity detection method

Haikun Wang, Zhongfu Ye, Jingdong Chen

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

8 Citations (Scopus)

Abstract

This paper presents a front-end speech enhancement approach to robust speech recognition in automotive environments. It combines hybrid voice activity detection (VAD), relative transfer function (RTF) based generalized sidelobe cancellation, and single-channel post-filtering to enhance the speech signal of interest, thereby improving the robustness of speech recognition. First, we choose four typical driving scenarios, which cover most of the noise types found in automobiles, to record training data. The recorded data is then used to train deep neural network (DNN) models for both speech and noise. The trained DNNs are subsequently used to estimate the speech presence probability on a frame-by-frame basis. This speech presence probability is then combined with the output of an energy-based VAD to form a hybrid VAD, which serves as the basis for the remaining components of the speech enhancement system, including RTF estimation, adaptive beamforming, and post-filtering. Experiments are conducted in real automotive environments. The results show that the developed method can significantly improve the performance of both VAD and automatic speech recognition (ASR).
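The hybrid VAD idea in the abstract can be illustrated with a minimal sketch. The abstract does not state how the DNN-based speech presence probability and the energy-based VAD are fused, so the rule used here (a frame is speech only when both detectors agree) is an assumption, as are the function names, the frame sizes, the 0.5 probability threshold, and the percentile-based noise floor. The `spp` array stands in for the output of the trained DNNs, which is not reproduced here.

```python
import numpy as np


def frame_energy_db(signal, frame_len=400, hop=160):
    """Return the log energy (dB) of each frame of a 1-D signal."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        # Small constant avoids log(0) on silent frames.
        energies[i] = 10.0 * np.log10(np.mean(frame ** 2) + 1e-12)
    return energies


def energy_vad(energies_db, margin_db=6.0):
    """Flag frames whose energy exceeds a noise floor by margin_db.

    Assumption: the noise floor is the 10th percentile of the frame
    energies. This is an illustrative choice, not the paper's estimator.
    """
    noise_floor = np.percentile(energies_db, 10)
    return energies_db > noise_floor + margin_db


def hybrid_vad(spp, energy_flags, spp_threshold=0.5):
    """Combine per-frame speech presence probability with energy flags.

    Assumed fusion rule: a frame is speech only if both detectors agree.
    """
    spp = np.asarray(spp, dtype=float)
    return (spp > spp_threshold) & np.asarray(energy_flags, dtype=bool)
```

Under this assumed rule, a noise frame that the probability model wrongly scores as speech is still rejected when its energy stays near the noise floor, and a loud non-speech frame is rejected when its speech presence probability is low. Downstream stages such as RTF estimation would then update only on frames where the combined decision is true.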

Original language: English
Title of host publication: 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 456-460
Number of pages: 5
ISBN (Electronic): 9781538681510
DOI
Publication status: Published - 2 Nov 2018
Event: 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Tokyo, Japan
Duration: 17 Sep 2018 to 20 Sep 2018

Publication series

Name: 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Proceedings

Conference

Conference: 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018
Country/Territory: Japan
City: Tokyo
Period: 17/09/18 to 20/09/18
