U2-KWS: Unified Two-Pass Open-Vocabulary Keyword Spotting with Keyword Bias

Ao Zhang, Pan Zhou, Kaixun Huang, Yong Zou, Ming Liu, Lei Xie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Open-vocabulary keyword spotting (KWS), which allows users to customize keywords, has attracted increasing interest. However, existing methods that combine an acoustic model with post-processing train the acoustic model with ASR training criteria to model all phonemes, which leaves it under-optimized for the KWS task. To solve this problem, we propose a novel unified two-pass open-vocabulary KWS (U2-KWS) framework inspired by the two-pass ASR model U2. Specifically, we employ the CTC branch as the first-stage model to detect potential keyword candidates and the decoder branch as the second-stage model to validate those candidates. To enhance arbitrary customized keywords, we redesign the U2 training procedure for U2-KWS and inject keyword information into both branches through cross-attention between audio and text. We perform experiments on our internal dataset and Aishell-1. The results show that U2-KWS achieves a significant relative wake-up rate improvement of 41% over traditional customized KWS systems when the false alarm rate is fixed at 0.5 per hour.
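The detect-then-validate flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `two_pass_spot`, the `Detection` type, the scorer callables, and the threshold values are all hypothetical names chosen for illustration, and the scorers stand in for the CTC branch (first stage) and the decoder branch (second stage) without modelling either one.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical scorer signature: (audio frames, keyword text) -> confidence.
Scorer = Callable[[Sequence[Sequence[float]], str], float]


@dataclass
class Detection:
    keyword: str
    first_pass_score: float
    second_pass_score: float


def two_pass_spot(
    frames: Sequence[Sequence[float]],
    keyword: str,
    first_pass: Scorer,
    second_pass: Scorer,
    first_threshold: float = 0.5,   # assumed value, not from the paper
    second_threshold: float = 0.5,  # assumed value, not from the paper
) -> Optional[Detection]:
    """Return a Detection only if both stages accept the candidate."""
    # Stage 1: propose a candidate (the CTC branch plays this role in U2-KWS).
    s1 = first_pass(frames, keyword)
    if s1 < first_threshold:
        return None  # no candidate, so the second stage is never run

    # Stage 2: validate the candidate (the decoder branch plays this role).
    s2 = second_pass(frames, keyword)
    if s2 < second_threshold:
        return None  # candidate rejected by the second stage

    return Detection(keyword, s1, s2)
```

In a cascade of this shape the second stage only runs on candidates the first stage lets through, so the first threshold mainly controls how many candidates reach validation and the second threshold controls how many of them are accepted.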

Original language: English
Title of host publication: 2023 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350306897
State: Published - 2023
Event: 2023 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023 - Taipei, Taiwan, Province of China
Duration: 16 Dec 2023 to 20 Dec 2023

Publication series

Name: 2023 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023

Conference

Conference: 2023 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 16/12/23 to 20/12/23

Keywords

  • customized keyword bias
  • multi-task learning
  • Open-vocabulary keyword spotting
  • U2-KWS
