Toward high-performance language-independent query-by-example spoken term detection for MediaEval 2015: Post-evaluation analysis

Cheung Chi Leung, Lei Wang, Haihua Xu, Jingyong Hou, Van Tung Pham, Hang Lv, Lei Xie, Xiong Xiao, Chongjia Ni, Bin Ma, Eng Siong Chng, Haizhou Li

Research output: Contribution to journal, conference article, peer-reviewed

18 citations (Scopus)

Abstract

This paper documents the significant components of a state-of-the-art language-independent query-by-example spoken term detection system designed for the Query by Example Search on Speech Task (QUESST) in MediaEval 2015. We developed exact and partial matching DTW systems, and WFST-based symbolic search systems, to handle different types of search queries. To handle the noisy and reverberant speech in the task, we trained tokenizers using data augmented with different noise and reverberation conditions. Our post-evaluation analysis showed that the phone boundary labels provided by the improved tokenizers yield more accurate speech activity detection in DTW systems. We argue that acoustic condition mismatch is possibly a more important factor than language mismatch for obtaining consistent gains from stacked bottleneck features. Our post-evaluation system, which involves a smaller number of component systems, can outperform our submitted systems, which performed the best for the task.

Original language: English
Pages (from-to): 3703-3707
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
08-12-September-2016
Publication status: Published - 2016
Event: 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016 - San Francisco, United States
Duration: 8 Sep 2016 to 16 Sep 2016
