Fundamental frequency modeling using wavelets for emotional voice conversion

Huaiping Ming, Dongyan Huang, Minghui Dong, Haizhou Li, Lei Xie, Shaofei Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

33 Scopus citations

Abstract

This paper presents a representation of fundamental frequency (F0) based on the continuous wavelet transform (CWT) for prosody modeling in emotion conversion. Emotion conversion aims at converting speech from one emotional state to another. Specifically, we use the CWT to decompose F0 into a five-scale representation that corresponds to five temporal scales. A neutral voice is converted to an emotional voice under an exemplar-based voice conversion framework, in which spectrum and F0 are converted simultaneously. The simulation results demonstrate that the dynamics of F0 at different temporal scales can be well captured and converted using the five-scale CWT representation. The converted speech signals are evaluated both objectively and subjectively, and both evaluations confirm the effectiveness of the proposed method.
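
As a rough illustration of the five-scale decomposition described in the abstract, the sketch below applies a CWT to a log-F0 contour using only NumPy. The abstract does not state the mother wavelet, the scale spacing, or the preprocessing, so the Mexican hat wavelet, the one-octave spacing between scales, the z-score normalization, and the names `mexican_hat`, `cwt_five_scales`, and `base_scale` are all assumptions made for this example. It is not the authors' implementation.

```python
import numpy as np


def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet with unit L2 norm."""
    norm = 2.0 / (np.sqrt(3.0) * np.pi ** 0.25)
    return norm * (1.0 - t ** 2) * np.exp(-0.5 * t ** 2)


def cwt_five_scales(logf0, base_scale=2.0, num_scales=5):
    """Decompose a continuous log-F0 contour into num_scales CWT components.

    Assumptions (not taken from the abstract): the contour is already
    interpolated over unvoiced frames, scales are one octave apart
    (base_scale * 2**i frames), and the contour is z-score normalized.
    Returns an array of shape (num_scales, len(logf0)).
    """
    x = np.asarray(logf0, dtype=float)
    x = (x - x.mean()) / (x.std() + 1e-12)  # zero mean, unit variance
    out = np.empty((num_scales, x.size))
    for i in range(num_scales):
        s = base_scale * 2.0 ** i
        half = int(np.ceil(5 * s))  # kernel support of +/- 5 scale units
        t = np.arange(-half, half + 1) / s
        kernel = mexican_hat(t) / np.sqrt(s)
        # Full convolution, then crop the centre so the output stays
        # aligned with the input even when the kernel is longer than x.
        full = np.convolve(x, kernel, mode="full")
        out[i] = full[half:half + x.size]
    return out


if __name__ == "__main__":
    # Synthetic contour: a fast and a slow oscillation around a mean log-F0.
    frames = np.arange(400)
    contour = (5.0
               + 0.10 * np.sin(2 * np.pi * frames / 50)
               + 0.05 * np.sin(2 * np.pi * frames / 200))
    components = cwt_five_scales(contour)
    print(components.shape)
```

Under these assumptions, the low-index rows respond to fast F0 movements and the high-index rows to slow, phrase-like trends, which is the kind of multi-scale separation the abstract refers to.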

Original language: English
Title of host publication: 2015 International Conference on Affective Computing and Intelligent Interaction, ACII 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 804-809
Number of pages: 6
ISBN (Electronic): 9781479999538
State: Published - 2 Dec 2015
Event: 2015 International Conference on Affective Computing and Intelligent Interaction, ACII 2015 - Xi'an, China
Duration: 21 Sep 2015 – 24 Sep 2015

Publication series

Name: 2015 International Conference on Affective Computing and Intelligent Interaction, ACII 2015

Conference

Conference: 2015 International Conference on Affective Computing and Intelligent Interaction, ACII 2015
Country/Territory: China
City: Xi'an
Period: 21/09/15 – 24/09/15

Keywords

  • emotion
  • prosody
  • sparse representation
  • voice conversion
