TY - GEN
T1 - WaveNet Factorization with Singular Value Decomposition for Voice Conversion
AU - Du, Hongqiang
AU - Tian, Xiaohai
AU - Xie, Lei
AU - Li, Haizhou
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/12
Y1 - 2019/12
AB - The WaveNet vocoder has shown a clear advantage over traditional vocoders in voice quality. However, training a speaker-dependent WaveNet vocoder usually requires a relatively large amount of speech data. It therefore remains a challenge to build a high-quality WaveNet vocoder for low-resource tasks such as voice conversion, where speech samples are limited in real applications. We propose to use singular value decomposition (SVD) to reduce the number of WaveNet parameters while maintaining output voice quality. Specifically, we apply SVD to the dilated convolution layers and impose a semi-orthogonal constraint to improve performance. Experiments conducted on the CMU-ARCTIC database show that, compared with the original WaveNet vocoder, the proposed method maintains similar performance in terms of both quality and similarity while using much less training data.
KW - Singular Value Decomposition (SVD)
KW - Voice Conversion (VC)
KW - WaveNet
UR - http://www.scopus.com/inward/record.url?scp=85081599248&partnerID=8YFLogxK
U2 - 10.1109/ASRU46091.2019.9003801
DO - 10.1109/ASRU46091.2019.9003801
M3 - Conference contribution
AN - SCOPUS:85081599248
T3 - 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings
SP - 152
EP - 159
BT - 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019
Y2 - 15 December 2019 through 18 December 2019
ER -