Controlling Expressivity using Input Codes in Neural Network based TTS

Xiaolian Zhu, Lei Xie, Xiao Chen, Xiaoyan Lou, Xuan Zhu, Xingjun Tan

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

2 Citations (Scopus)

Abstract

This paper presents a study on the use of input codes in neural network acoustic modeling for expressive TTS. Specifically, we use different kinds of input codes, augmented with the linguistic features, as the input of a BLSTM-based acoustic model to control the expressivity of the synthesized speech. The input codes, in one-hot representation, include a dialogue code, a sentiment code and a sentence position code. The dialogue code indicates whether the text is dialogue or narration in an audiobook story. The sentiment code is obtained from a sentiment analysis tool, which labels each sentence as positive, negative or neutral. The sentence position code indicates the position of the sentence in the paragraph. We believe these codes are highly related to the expressiveness of audiobook speech. Experiments on data from the Blizzard Challenge 2017 demonstrate the effectiveness of input codes in the neural network approach to expressive TTS.

Original language: English
Title of host publication: 2018 1st Asian Conference on Affective Computing and Intelligent Interaction, ACII Asia 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781538653111
DOI
Publication status: Published - 21 Sep 2018
Event: 1st Asian Conference on Affective Computing and Intelligent Interaction, ACII Asia 2018 - Beijing, China
Duration: 20 May 2018 to 22 May 2018

Publication series

Name: 2018 1st Asian Conference on Affective Computing and Intelligent Interaction, ACII Asia 2018

Conference

Conference: 1st Asian Conference on Affective Computing and Intelligent Interaction, ACII Asia 2018
Country/Territory: China
City: Beijing
Period: 20/05/18 to 22/05/18
