Effective Wavenet Adaptation for Voice Conversion with Limited Data

Hongqiang Du, Xiaohai Tian, Lei Xie, Haizhou Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

WaveNet has shown great potential as a direct conversion model in voice conversion. However, due to the model complexity, WaveNet always requires a large amount of training data, which has limited its applications in voice conversion, where training data is scarce. In this paper, we propose a WaveNet adaptation method that effectively reduces the need for adaptation data. We first train a speaker-independent WaveNet conversion model with a multi-speaker dataset. Adaptation is then applied with limited target-speaker data. Specifically, singular value decomposition (SVD) is applied to the dilated convolution layers of WaveNet to reduce the number of parameters, which makes adaptation more effective with limited data. Experiments conducted on the CMU-ARCTIC and CSTR-VCTK corpora show that the proposed method outperforms baseline methods in terms of both quality and similarity.
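The parameter reduction described in the abstract can be illustrated with a truncated SVD of a single convolution kernel. This is only a minimal sketch under stated assumptions, not the authors' implementation: the kernel shape (64 output channels, 64 input channels, width 2), the rank of 16, the random weights, and the helper name `low_rank_factorize` are all hypothetical, and the abstract does not say which factors are updated during adaptation.

```python
import numpy as np


def low_rank_factorize(weight, rank):
    """Split a conv kernel (out_ch, in_ch, k) into two low-rank factors.

    Hypothetical helper for illustration only; the paper's exact
    factorization and adaptation scheme are not given in the abstract.
    """
    out_ch, in_ch, k = weight.shape
    # Flatten the kernel into a 2-D matrix so SVD can be applied.
    mat = weight.reshape(out_ch, in_ch * k)
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    # Keep the top `rank` singular directions; fold the singular values into `a`.
    a = u[:, :rank] * s[:rank]   # shape (out_ch, rank)
    b = vt[:rank, :]             # shape (rank, in_ch * k)
    return a, b, s


# Assumed sizes, chosen only to make the parameter counts concrete.
rng = np.random.default_rng(0)
out_ch, in_ch, k, rank = 64, 64, 2, 16
weight = rng.standard_normal((out_ch, in_ch, k))

a, b, s = low_rank_factorize(weight, rank)

full_params = weight.size            # parameters in the original kernel
low_rank_params = a.size + b.size    # parameters in the two factors
approx = (a @ b).reshape(weight.shape)
error = np.linalg.norm(weight - approx)

print(f"full: {full_params}, low-rank: {low_rank_params}, error: {error:.3f}")
```

With these assumed sizes the two factors hold 3072 values in place of the original 8192. A random kernel has no low-rank structure, so the approximation error here is large; the sketch shows only the parameter bookkeeping, not the quality that a trained model would retain.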

Original language: English
Title of host publication: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 7779-7783
Number of pages: 5
ISBN (Electronic): 9781509066315
DOI
Publication status: Published - May 2020
Event: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Barcelona, Spain
Duration: 4 May 2020 to 8 May 2020

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2020-May
ISSN (Print): 1520-6149

Conference

Conference: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020
Country/Territory: Spain
City: Barcelona
Period: 4/05/20 to 8/05/20
