DCCRN-KWS: An Audio Bias Based Model for Noise Robust Small-Footprint Keyword Spotting

Shubo Lv, Xiong Wang, Sining Sun, Long Ma, Lei Xie

Research output: Contribution to journal › Conference article › Peer-reviewed

4 citations (Scopus)

Abstract

Real-world complex acoustic environments, especially those with a low signal-to-noise ratio (SNR), pose tremendous challenges for a keyword spotting (KWS) system. Inspired by recent advances in neural speech enhancement and context bias in speech recognition, we propose a robust audio context bias based DCCRN-KWS model to address this challenge. We form the whole architecture as a multi-task learning framework for both denoising and keyword spotting, where the DCCRN encoder is connected with the KWS model. Aided by the denoising task, we further introduce an audio context bias module that leverages real keyword samples and biases the network to better discriminate keywords in noisy conditions. Feature merge and complex context linear modules are also introduced to strengthen such discrimination and to effectively leverage contextual information, respectively. Experiments on a challenging internal dataset and the public HIMIYA dataset show that DCCRN-KWS is superior in performance, while the ablation study demonstrates the good design of the whole model.
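
The abstract describes a multi-task setup in which denoising and keyword spotting are trained jointly, but it does not state the loss functions or how they are weighted. As a minimal sketch of what such a joint objective could look like, the Python below combines a binary cross-entropy keyword loss with a mean-squared-error denoising loss. The function names, the choice of losses, and the `denoise_weight` parameter are all illustrative assumptions, not details taken from the paper.

```python
import math


def binary_cross_entropy(prob, label):
    """Keyword loss for one utterance; prob is the predicted keyword probability."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def mean_squared_error(estimate, target):
    """Stand-in denoising loss between an enhanced signal and its clean reference."""
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


def multitask_loss(kws_prob, kws_label, enhanced, clean, denoise_weight=0.5):
    """Weighted sum of both task losses.

    denoise_weight is a hypothetical hyperparameter; the abstract gives no value.
    """
    kws = binary_cross_entropy(kws_prob, kws_label)
    denoise = mean_squared_error(enhanced, clean)
    return kws + denoise_weight * denoise
```

In a setup like this, the shared encoder would receive gradients from both terms, which is one common way an auxiliary denoising task can shape the features used for keyword detection.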

Original language: English
Pages (from-to): 929-933
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2023-August
Publication status: Published - 2023
Event: 24th International Speech Communication Association, Interspeech 2023 - Dublin, Ireland
Duration: 20 Aug 2023 to 24 Aug 2023
