Predicting articulatory movement from text using deep architecture with stacked bottleneck features

Zhen Wei, Zhizheng Wu, Lei Xie

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

3 Citations (Scopus)

Abstract

Using speech or text to predict articulatory movements can benefit speech-related applications. Many approaches have been proposed for the acoustic-to-articulatory inversion problem, whereas predicting articulatory movements from text has been explored far less. In this paper, we investigate the feasibility of using a deep neural network (DNN) for articulatory movement prediction from text. To improve prediction performance, we combine full-context features and state and phone information with stacked bottleneck features, which provide wide linguistic context, as network input. We show on the MNGU0 data set that our DNN approach achieves a root mean-squared error (RMSE) of 0.7370 mm, the lowest RMSE reported in the literature. We also confirm the effectiveness of stacked bottleneck features, which can carry important contextual information.
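The abstract does not spell out how the bottleneck features are stacked or how the RMSE is averaged, so the following is only a minimal sketch under stated assumptions: per-frame bottleneck vectors are concatenated with their k left and k right neighbours (edges padded by repeating the boundary frame), and the RMSE is taken over all frames and channels at once. The function names `stack_context` and `rmse` are hypothetical and not taken from the paper.

```python
import math


def stack_context(frames, k):
    """Concatenate each frame with its k left and k right neighbours.

    frames: list of equal-length feature vectors, one per time step.
    Edge frames are padded by repeating the first/last frame
    (an assumption; the paper's padding scheme is not stated here).
    """
    if not frames:
        return []
    last = len(frames) - 1
    stacked = []
    for t in range(len(frames)):
        row = []
        for offset in range(-k, k + 1):
            idx = min(max(t + offset, 0), last)  # clamp index to valid range
            row.extend(frames[idx])
        stacked.append(row)
    return stacked


def rmse(predicted, reference):
    """Root mean-squared error pooled over all frames and channels.

    Assumption: a single pooled average; per-channel averaging
    would be an equally plausible reading of the abstract.
    """
    total, count = 0.0, 0
    for p_frame, r_frame in zip(predicted, reference):
        for p, r in zip(p_frame, r_frame):
            total += (p - r) ** 2
            count += 1
    return math.sqrt(total / count)


if __name__ == "__main__":
    # Three frames of 2-dimensional (made-up) bottleneck activations.
    bottleneck = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
    wide = stack_context(bottleneck, k=1)
    print(len(wide), len(wide[0]))  # 3 frames, each 2 * (2*1 + 1) = 6 values
```

With k = 1, each stacked frame covers three time steps, which illustrates how stacking widens the context seen by the second network without changing the number of frames.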

Original language: English
Title of host publication: 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9789881476821
DOI
Publication status: Published - 17 Jan 2017
Externally published
Event: 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016 - Jeju, South Korea
Duration: 13 Dec 2016 to 16 Dec 2016

Publication series

Name: 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016

Conference

Conference: 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2016
Country/Territory: South Korea
City: Jeju
Period: 13/12/16 to 16/12/16
