Abstract
Voice conversion aims to modify the characteristics of one speaker's speech so that it sounds as if it were spoken by another speaker, without changing the linguistic content. The task has attracted considerable attention, and various approaches have been proposed over the past two decades. Voice conversion approaches are usually evaluated through time-intensive subjective listening tests, which require a large amount of human labor. This paper proposes an automatic voice conversion evaluation strategy based on perceptual background noise distortion and speaker similarity. Experimental results show that our automatic evaluation results agree well with the subjective listening results. We further use our strategy to select the best converted samples from multiple voice conversion systems, and our submission achieves promising results in the Voice Conversion Challenge 2016 (VCC2016).
Original language | English |
---|---|
Pages | 44-51 |
Number of pages | 8 |
State | Published - 2016 |
Event | 9th ISCA Speech Synthesis Workshop, SSW 2016, Sunnyvale, United States (13 Sep 2016 → 15 Sep 2016) |
Conference
Conference | 9th ISCA Speech Synthesis Workshop, SSW 2016 |
---|---|
Country/Territory | United States |
City | Sunnyvale |
Period | 13/09/16 → 15/09/16 |
Keywords
- objective measures
- speaker similarity score
- speech quality assessment
- subjective listening tests
- voice conversion