OBJECTIVE DISTANCE MEASURES FOR ASSESSING CONCATENATIVE SPEECH SYNTHESIS

Jing Dong Chen, Nick Campbell

Research output: Contribution to conference, Paper, peer-reviewed

14 citations (Scopus)

Abstract

Several different acoustic transforms of the speech signal are compared for use in the assessment and evaluation of concatenative speech synthesis. The transforms tested include LPC, LSP, MFCC, bispectrum, Mellin transform of the log spectrum, Wigner-Ville distribution (WVD), etc. The computed distances between a synthesised utterance and a naturally spoken version of the same sentence are compared by correlation with perceptually-based scores obtained from a MOS evaluation. The results show that the distances computed using the bispectrum have the highest degree of correlation with the MOS score. Both the RMFCC and the LPC outperform the MFCC and the LPCC. The WVD-based cepstrum is found to behave poorly in this task.
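The evaluation idea in the abstract (an objective distance per utterance, then correlation against MOS) can be sketched as follows. This is a minimal illustration only: the abstract does not state which frame-level distance, time alignment, or correlation coefficient was used, so the Euclidean distance, the assumption of already time-aligned frames, and the Pearson coefficient are all assumptions, and the function names `mean_frame_distance` and `pearson_correlation` are hypothetical.

```python
import math


def mean_frame_distance(synth_frames, natural_frames):
    """Average Euclidean distance between paired feature frames.

    Assumption: both utterances are already time-aligned, so frame i of the
    synthesised utterance corresponds to frame i of the natural one.
    Each frame is a sequence of floats (e.g. cepstral coefficients).
    """
    if len(synth_frames) != len(natural_frames) or not synth_frames:
        raise ValueError("need two non-empty, equal-length frame sequences")
    total = 0.0
    for s, n in zip(synth_frames, natural_frames):
        if len(s) != len(n):
            raise ValueError("frames must have the same dimension")
        total += math.sqrt(sum((a - b) ** 2 for a, b in zip(s, n)))
    return total / len(synth_frames)


def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

In use, one would compute `mean_frame_distance` once per test utterance for a given transform, collect the values in a list, and pass that list together with the matching MOS scores to `pearson_correlation`. Under this sketch, a transform whose distances track perceived quality would be expected to give a correlation of large magnitude, and negative in sign if larger distances go with lower MOS.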

Original language: English
Pages: 611-614
Number of pages: 4
Publication status: Published - 1999
Externally published: Yes
Event: 6th European Conference on Speech Communication and Technology, EUROSPEECH 1999 - Budapest, Hungary
Duration: 5 Sep 1999 to 9 Sep 1999

Conference

Conference: 6th European Conference on Speech Communication and Technology, EUROSPEECH 1999
Country/Territory: Hungary
City: Budapest
Period: 5/09/99 to 9/09/99
