Multilingual bottle-neck feature learning from untranscribed speech

Hongjie Chen, Cheung Chi Leung, Lei Xie, Bin Ma, Haizhou Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

27 Citations (Scopus)

Abstract

We propose to learn a low-dimensional feature representation for multiple languages without access to manual transcriptions. The multilingual features are extracted from a shared bottleneck layer of a multi-task learning deep neural network, which is trained using unsupervised phoneme-like labels. These labels are obtained from language-dependent Dirichlet process Gaussian mixture models (DPGMMs). Vocal tract length normalization (VTLN) is applied to mel-frequency cepstral coefficients to reduce talker variation when the DPGMMs are trained. The proposed features are evaluated using the ABX phoneme discriminability test in the Zero Resource Speech Challenge 2017. In the experiments, we show that the proposed features perform well across different languages, and that they consistently outperform our previously proposed DPGMM posteriorgrams, which achieved the best performance in the same challenge in 2015.
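
The architecture described in the abstract can be illustrated with a minimal, untrained sketch: the hidden and bottleneck layers are shared across languages, each language has its own softmax output over its DPGMM cluster labels, and the bottleneck activations serve as the multilingual features. The class name, layer sizes, tanh activation, linear bottleneck, and language keys below are all assumptions made for illustration. The abstract does not specify them, and training is omitted entirely.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Affine transform of input vector x by a (weights, biases) layer."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


class MultiTaskBottleneckNet:
    """Hypothetical multi-task network: shared layers, one output head per language."""

    def __init__(self, n_in, n_hidden, n_bottleneck, n_labels_per_lang, seed=0):
        rng = random.Random(seed)
        # Layers shared by all languages.
        self.hidden = make_layer(n_in, n_hidden, rng)
        self.bottleneck = make_layer(n_hidden, n_bottleneck, rng)
        # One language-specific head per language, sized to that language's
        # number of unsupervised (DPGMM-derived) labels.
        self.heads = {
            lang: make_layer(n_bottleneck, n_labels, rng)
            for lang, n_labels in n_labels_per_lang.items()
        }

    def extract(self, frame):
        """Return the bottleneck activations for one input frame."""
        h = [math.tanh(v) for v in dense(frame, self.hidden)]
        return dense(h, self.bottleneck)  # linear bottleneck (an assumption)

    def posteriors(self, frame, lang):
        """Return label posteriors for one frame from the head of `lang`."""
        return softmax(dense(self.extract(frame), self.heads[lang]))


# Hypothetical sizes: 39-dimensional input frames, 8-dimensional bottleneck.
net = MultiTaskBottleneckNet(39, 64, 8, {"lang_a": 50, "lang_b": 70})
features = net.extract([0.0] * 39)
```

In a setup like this, only `extract` would be needed at test time; the language-specific heads exist to provide training targets for the shared layers.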

Original language: English
Title of host publication: 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 727-733
Number of pages: 7
ISBN (Electronic): 9781509047888
DOI
Publication status: Published - 2 Jul 2017
Event: 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Okinawa, Japan
Duration: 16 Dec 2017 to 20 Dec 2017

Publication series

Name: 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Proceedings
Volume: 2018-January

Conference

Conference: 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017
Country/Territory: Japan
City: Okinawa
Period: 16/12/17 to 20/12/17
