Intrinsic spectral analysis based on temporal context features for query-by-example spoken term detection

Peng Yang, Cheung Chi Leung, Lei Xie, Bin Ma, Haizhou Li

Research output: Contribution to journal › Conference article › Peer-reviewed

22 citations (Scopus)

Abstract

We investigate the use of intrinsic spectral analysis (ISA) for query-by-example spoken term detection (QbE-STD). In this task, spoken queries and test utterances in an audio archive are converted to ISA features, and dynamic time warping is applied to match the feature sequence of each query against those of the test utterances. Motivated by manifold learning, ISA has been proposed to recover, from untranscribed utterances, a set of nonlinear basis functions for the speech manifold, and has been shown to offer improved phonetic separability and inherent speaker independence. Because of the coarticulation phenomenon in speech, we propose to use temporal context information to obtain the ISA features. The Gaussian posteriorgram, an efficient acoustic representation commonly used in QbE-STD, serves as the baseline feature. Experimental results on the TIMIT speech corpus show that the ISA features provide a relative improvement of 13.5% in mean average precision over the baseline features when the temporal context information is used.
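The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `stack_context` and `dtw_cost`, the edge padding, the cosine distance, and the length normalisation are all assumptions made for illustration, and the sketch aligns whole sequences, whereas a QbE-STD system would normally search for the query inside a longer utterance. The ISA transform itself is not shown; any frame-level feature matrix can stand in for it.

```python
import numpy as np


def stack_context(frames, k):
    """Concatenate each frame with its k left and k right neighbours.

    frames: array of shape (T, D). Edges are padded by repeating the first
    and last frame (an assumption, not taken from the paper).
    Returns an array of shape (T, (2 * k + 1) * D).
    """
    padded = np.pad(frames, ((k, k), (0, 0)), mode="edge")
    num_frames = frames.shape[0]
    return np.hstack([padded[i:i + num_frames] for i in range(2 * k + 1)])


def dtw_cost(query, utterance):
    """Length-normalised DTW cost between two feature sequences.

    Uses cosine distance between frames (an assumed choice) and the
    standard three-way step pattern. Lower cost means a better match.
    """
    q_norm = np.maximum(np.linalg.norm(query, axis=1, keepdims=True), 1e-12)
    u_norm = np.maximum(np.linalg.norm(utterance, axis=1, keepdims=True), 1e-12)
    dist = 1.0 - (query / q_norm) @ (utterance / u_norm).T

    n, m = dist.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = dist[i - 1, j - 1] + best_prev
    return acc[n, m] / (n + m)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query = stack_context(rng.random((20, 13)), k=2)      # hypothetical query features
    utterance = stack_context(rng.random((35, 13)), k=2)  # hypothetical test features
    print("DTW cost:", dtw_cost(query, utterance))
```

In a detection setting, costs like this one would be computed for every query and utterance pair and the utterances ranked by cost, which is the ranking that mean average precision then evaluates.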
