Voice conversion using Bayesian analysis and dynamic kernel features

Na Li, Xiangyang Zeng, Yu Qiao, Zhifeng Li

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

When the training utterances are sparse, voice conversion based on the Mixture of Probabilistic Linear Regressions (MPLR) is prone to overfitting. To address this problem, we adopt dynamic kernel features in place of the original speaker's cepstral features and estimate the transformation parameters by maximum a posteriori (MAP) estimation with Bayesian inference. First, the features of the original speaker are converted into dynamic kernel features by a kernel transformation. Then, prior information on the transformation parameters is introduced. Finally, based on different assumptions about the conversion error, we propose two methods for estimating the transformation parameters. Compared with MPLR, the proposed method achieves a 4.25% relative reduction in average cepstral distortion in objective evaluations and obtains higher naturalness and similarity scores in subjective evaluations. Experimental results indicate that the proposed method can alleviate the overfitting problem.
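To illustrate the general idea described in the abstract, the sketch below shows MAP estimation of a linear transform applied to kernel-mapped source features. It is a minimal illustration, not the paper's implementation: it assumes an RBF kernel map, a zero-mean Gaussian prior on the regression weights, and Gaussian conversion error (which makes the MAP solution equivalent to ridge regression); function names, feature dimensions, and hyperparameters are hypothetical.

```python
# Illustrative sketch only: MAP-regularized linear regression on kernel
# features, assuming Gaussian prior and Gaussian conversion error.
import numpy as np

def rbf_kernel_features(X, centers, gamma=0.1):
    """Map source cepstral frames X (n x d) to kernel features via an
    RBF kernel evaluated against a set of centers (m x d)."""
    sq_dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)           # n x m kernel features

def map_linear_regression(Phi, Y, noise_var=1.0, prior_var=10.0):
    """MAP estimate of W in Y ~ Phi @ W with a zero-mean Gaussian prior
    on W; equivalent to ridge regression with lambda = noise_var/prior_var."""
    lam = noise_var / prior_var
    m = Phi.shape[1]
    A = Phi.T @ Phi + lam * np.eye(m)          # regularized normal equations
    return np.linalg.solve(A, Phi.T @ Y)       # m x d_target weight matrix

# Toy usage with random stand-in data (a real system would use time-aligned
# source/target cepstral frames from parallel training utterances).
rng = np.random.default_rng(0)
X_src = rng.normal(size=(200, 24))             # source cepstral frames
Y_tgt = rng.normal(size=(200, 24))             # aligned target cepstral frames
centers = X_src[rng.choice(200, 32, replace=False)]
Phi = rbf_kernel_features(X_src, centers)
W = map_linear_regression(Phi, Y_tgt)
Y_hat = Phi @ W                                # converted (predicted) frames
```

The prior variance acts as the regularizer that counteracts overfitting when training data are sparse; the paper's two estimation methods differ in their assumptions about the conversion error, which this single-Gaussian sketch does not distinguish.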

Original language: English
Pages (from-to): 455-461
Number of pages: 7
Journal: Shengxue Xuebao/Acta Acustica
Volume: 40
Issue number: 3
Publication status: Published - 1 May 2015
