Investigating neural network based query-by-example keyword spotting approach for personalized wake-up word detection in Mandarin Chinese

Jingyong Hou, Lei Xie, Zhonghua Fu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

18 Scopus citations

Abstract

We use a query-by-example keyword spotting (QbyE-KWS) approach to solve the personalized wake-up word detection problem for small-footprint, low-computational-cost on-device applications. QbyE-KWS takes keyword examples as templates and matches them against an audio stream via dynamic time warping (DTW) to determine whether the keyword is present. In this paper, we use neural networks as acoustic models to extract DNN/LSTM phoneme posterior features and LSTM embedding features. Specifically, we investigate the LSTM embedding feature extractor for different modeling units in Mandarin, ranging from phonemes to words. We also study the performance of two popular DTW approaches: S-DTW and SLN-DTW. SLN-DTW searches for the keyword in a long audio stream accurately and effectively, without the segmentation procedure used in S-DTW approaches. Our study shows that the DNN phoneme posterior plus SLN-DTW approach achieves the highest computational efficiency and state-of-the-art performance, with a 78% relative miss rate reduction compared with the S-DTW approach. The word-level LSTM embedding feature shows superior performance compared with other embedding units.
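As a rough illustration of the template matching described in the abstract, the sketch below runs a generic subsequence DTW of a short query against a longer stream of phoneme posterior frames. It is not the paper's S-DTW or SLN-DTW. The function names, the negative-log inner-product distance, and the path-length normalization are assumptions made for this example only.

```python
import math


def neg_log_dot(p, q, eps=1e-10):
    """Local distance between two posterior vectors: -log of their inner product."""
    return -math.log(max(sum(a * b for a, b in zip(p, q)), eps))


def subsequence_dtw(query, stream, dist=neg_log_dot):
    """Align the whole query to the best-matching region of the stream.

    Returns (score, start, end): the accumulated cost divided by the
    path length, and the inclusive stream frame range of the match.
    """
    n = len(stream)
    # Each cell holds (accumulated cost, path length, start frame).
    # Row 0: the query may begin at any stream frame at no extra cost.
    prev = [(dist(query[0], stream[j]), 1, j) for j in range(n)]
    for i in range(1, len(query)):
        cur = []
        for j in range(n):
            # Predecessors: vertical, plus horizontal and diagonal when j > 0.
            cands = [prev[j]]
            if j > 0:
                cands += [cur[j - 1], prev[j - 1]]
            cost, length, start = min(cands, key=lambda c: c[0])
            cur.append((cost + dist(query[i], stream[j]), length + 1, start))
        prev = cur
    # The match may end at any stream frame; pick the lowest normalized cost.
    end = min(range(n), key=lambda j: prev[j][0] / prev[j][1])
    cost, length, start = prev[end]
    return cost / length, start, end
```

A detector built on this would compare the returned score against a threshold tuned on held-out data. The nested loops cost O(mn) for a query of m frames and a stream of n frames, so a practical on-device system would need a streaming or windowed variant.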

Original language: English
Title of host publication: Proceedings of 2016 10th International Symposium on Chinese Spoken Language Processing, ISCSLP 2016
Editors: Hsin-Min Wang, Qingzhi Hou, Yuan Wei, Tan Lee, Jianguo Wei, Lei Xie, Hui Feng, Jianwu Dang
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781509042937
DOIs
State: Published - 2 May 2017
Event: 10th International Symposium on Chinese Spoken Language Processing, ISCSLP 2016 - Tianjin, China
Duration: 17 Oct 2016 to 20 Oct 2016

Publication series

Name: Proceedings of 2016 10th International Symposium on Chinese Spoken Language Processing, ISCSLP 2016

Conference

Conference: 10th International Symposium on Chinese Spoken Language Processing, ISCSLP 2016
Country/Territory: China
City: Tianjin
Period: 17/10/16 to 20/10/16

Keywords

  • DNN
  • DTW
  • LSTM
  • Query-by-Example
  • Spotting
  • Wake-up Word Detection
