TY - GEN
T1 - Effective Wavenet Adaptation for Voice Conversion with Limited Data
AU - Du, Hongqiang
AU - Tian, Xiaohai
AU - Xie, Lei
AU - Li, Haizhou
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/5
Y1 - 2020/5
AB - WaveNet has shown great potential as a direct conversion model in voice conversion. However, owing to its model complexity, WaveNet typically requires a large amount of training data, which limits its application in voice conversion, where training data is scarce. In this paper, we propose a WaveNet adaptation method that effectively reduces the amount of adaptation data required. We first train a speaker-independent WaveNet conversion model on a multi-speaker dataset. The model is then adapted using limited data from the target speaker. Specifically, singular value decomposition (SVD) is applied to the dilated convolution layers of WaveNet to reduce the number of parameters, which makes adaptation more effective with limited data. Experiments conducted on the CMU-ARCTIC and CSTR-VCTK corpora show that the proposed method outperforms baseline methods in terms of both quality and similarity.
KW - Singular Value Decomposition (SVD)
KW - Voice Conversion (VC)
KW - WaveNet adaptation
UR - http://www.scopus.com/inward/record.url?scp=85089242967&partnerID=8YFLogxK
U2 - 10.1109/ICASSP40776.2020.9053315
DO - 10.1109/ICASSP40776.2020.9053315
M3 - Conference contribution
AN - SCOPUS:85089242967
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 7779
EP - 7783
BT - 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020
Y2 - 4 May 2020 through 8 May 2020
ER -