Unsupervised Bottleneck features for low-resource query-by-example spoken term detection

Hongjie Chen, Cheung Chi Leung, Lei Xie, Bin Ma, Haizhou Li

Research output: Journal contribution, Conference article, Peer-reviewed

40 Citations (Scopus)

Abstract

We propose a framework that ports Dirichlet process Gaussian mixture model (DPGMM) based labels to a deep neural network (DNN). The DNN trained using the unsupervised labels is used to extract a low-dimensional unsupervised speech representation, termed unsupervised bottleneck features (uBNFs), which capture considerable information for sound cluster discrimination. We investigate the performance of uBNFs in query-by-example spoken term detection (QbE-STD) on the TIMIT English speech corpus. Our uBNFs perform comparably with the cross-lingual bottleneck features (BNFs) extracted from a DNN trained using 171 hours of transcribed telephone speech in another language (Mandarin Chinese). With the score fusion of uBNFs and cross-lingual BNFs, we gain about 10% relative improvement in terms of mean average precision (MAP) compared with the cross-lingual BNFs alone. We also study the performance of the framework with different input features and different lengths of temporal context.
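As a rough illustration of the evaluation idea in the abstract, the sketch below computes mean average precision (MAP) for ranked detection scores and fuses two score lists. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the scores are fused, so the weighted sum and its `weight_a` parameter are hypothetical, as are all function names and the toy scores, which are made-up values rather than results from the paper.

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], relevant: Sequence[bool]) -> float:
    """Average precision for one query: mean of precision@k over the ranks k of true hits."""
    # Rank candidate utterances from highest to lowest detection score.
    ranked = sorted(zip(scores, relevant), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, is_relevant) in enumerate(ranked, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    per_query_scores: Sequence[Sequence[float]],
    per_query_relevance: Sequence[Sequence[bool]],
) -> float:
    """MAP: average precision averaged over all queries."""
    aps = [average_precision(s, r) for s, r in zip(per_query_scores, per_query_relevance)]
    return sum(aps) / len(aps) if aps else 0.0


def fuse_scores(
    scores_a: Sequence[float], scores_b: Sequence[float], weight_a: float = 0.5
) -> List[float]:
    """Weighted-sum fusion of two systems' scores (assumed scheme, hypothetical weight)."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(scores_a, scores_b)]


# Toy data only: one query scored against four utterances by two feature types.
relevance = [True, False, True, False]
ubnf_scores = [0.9, 0.8, 0.3, 0.1]    # hypothetical scores from uBNF-based matching
xling_scores = [0.4, 0.2, 0.9, 0.8]   # hypothetical scores from cross-lingual BNF matching
fused_scores = fuse_scores(ubnf_scores, xling_scores)

ap_ubnf = average_precision(ubnf_scores, relevance)
ap_xling = average_precision(xling_scores, relevance)
ap_fused = average_precision(fused_scores, relevance)
```

In this toy case each system ranks one true hit first and the other third, so fusing the scores moves both hits to the top of the ranking. This only illustrates why complementary representations can benefit from fusion; it says nothing about the size of the gain reported in the paper.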

Original language: English
Pages (from-to): 923-927
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 08-12-September-2016
Publication status: Published - 2016
Event: 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016 - San Francisco, United States
Duration: 8 Sep 2016 to 16 Sep 2016
