Controlling Emotion Strength with Relative Attribute for End-to-End Speech Synthesis

  • Xiaolian Zhu
  • Shan Yang
  • Geng Yang
  • Lei Xie
  • Northwestern Polytechnical University Xian
  • Hebei University of Economics and Business

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

54 citations (Scopus)

Abstract

Recently, attention-based end-to-end speech synthesis has achieved superior performance compared to traditional speech synthesis models, and several approaches, such as global style tokens, have been proposed to explore the style controllability of end-to-end models. Although existing methods perform well in style disentanglement and transfer, they are still unable to explicitly control the emotion of the generated speech. In this paper, we mainly focus on the subtle control of expressive speech synthesis, where the emotion category and strength can be easily controlled with a discrete emotion vector and a simple continuous scalar, respectively. The continuous strength controller is learned by a ranking function according to the relative attribute measured on an emotion dataset. Our method automatically learns the relationship between low-level acoustic features and high-level subtle emotion strength. Experiments show that our method effectively improves the controllability of an expressive end-to-end model.
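
The idea of learning a strength scalar from a ranking function can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the pairing scheme, or the optimizer, so everything below is an assumption. The sketch assumes a linear ranking function trained with a pairwise hinge loss so that "emotional" samples outrank "neutral" ones, then min-max normalizes the scores to [0, 1]. The function names, the 4-dimensional synthetic features, and the hyperparameters are hypothetical.

```python
import numpy as np


def learn_ranking_weights(neutral, emotional, lr=0.1, epochs=200, reg=1e-3):
    """Fit w so that w @ x_emotional exceeds w @ x_neutral by a margin.

    Assumption: every emotional sample should outrank every neutral one;
    the objective is an L2-regularized pairwise hinge loss.
    """
    dim = neutral.shape[1]
    # One difference vector per ordered (emotional, neutral) pair.
    diffs = (emotional[:, None, :] - neutral[None, :, :]).reshape(-1, dim)
    w = np.zeros(dim)
    for _ in range(epochs):
        margins = diffs @ w
        violated = diffs[margins < 1.0]  # pairs not yet separated by the margin
        grad = reg * w - violated.sum(axis=0) / len(diffs)
        w -= lr * grad
    return w


def strength(w, x, lo, hi):
    """Map raw ranking scores to a strength value in [0, 1]."""
    return np.clip((x @ w - lo) / (hi - lo), 0.0, 1.0)


# Synthetic stand-in for acoustic features (hypothetical data).
rng = np.random.default_rng(0)
neutral = rng.normal(0.0, 1.0, size=(50, 4))
emotional = rng.normal(0.0, 1.0, size=(50, 4))
emotional[:, 0] += 3.0  # assume dimension 0 carries the emotion cue

w = learn_ranking_weights(neutral, emotional)

# Normalize against the score range observed on the training data.
scores = np.concatenate([neutral, emotional]) @ w
lo, hi = scores.min(), scores.max()

s_neutral = strength(w, neutral, lo, hi)
s_emotional = strength(w, emotional, lo, hi)
print("mean strength, neutral:  ", s_neutral.mean())
print("mean strength, emotional:", s_emotional.mean())
```

In a synthesis system, a scalar of this kind would be supplied alongside the discrete emotion category as a conditioning input. How the paper wires it into the model is not described in the abstract.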

Original language: English
Title of host publication: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 192-199
Number of pages: 8
ISBN (electronic): 9781728103068
DOI
Publication status: Published - Dec 2019
Event: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Singapore, Singapore
Duration: 15 Dec 2019 to 18 Dec 2019

Publication series

Name: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings

Conference

Conference: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019
Country/Territory: Singapore
Singapore
Period: 15/12/19 to 18/12/19
