Learning neural network representations using cross-lingual bottleneck features with word-pair information

Yougen Yuan; Cheung Chi Leung; Lei Xie; Bin Ma; Haizhou Li

doi:10.21437/Interspeech.2016-317

School of Computer Science

Research output: Contribution to journal › Conference article › peer-review

17 Scopus citations

Abstract

We assume that only word pairs identified by humans are available in a low-resource target language. The word pairs are parameterized by a bottleneck feature (BNF) extractor that is trained using transcribed data in a high-resource language. The cross-lingual BNFs of the word pairs are used to train another neural network that generates a new feature representation in the target language. Pairwise learning of frame-level and word-level feature representations is investigated. The proposed feature representations were evaluated on a word discrimination task using the Switchboard telephone speech corpus. The learned features bring a 27.5% relative improvement over the previously best reported result on the task.
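
The word-level pairwise learning described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the loss function, the network architecture, or how frame sequences are turned into word-level vectors, so everything below is an assumption made for illustration: `mean_pool` is a hypothetical stand-in for the trained network, `pair_loss` is a generic cosine-based hinge loss of the kind often used with Siamese networks, and the `margin` value and toy BNF vectors are invented.

```python
import math

def mean_pool(frames):
    """Average a sequence of frame vectors into one fixed-size word vector.

    Hypothetical stand-in for the network that maps cross-lingual BNFs
    to a word-level representation; the paper's model is not shown here.
    """
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]

def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def pair_loss(x, y, same_word, margin=0.5):
    """Generic pairwise hinge loss (an assumption, not the authors' objective).

    Same-word pairs are penalized by their distance; different-word pairs
    are penalized only when they are closer than `margin`.
    """
    d = cosine_distance(x, y)
    if same_word:
        return d
    return max(0.0, margin - d)

# Toy 2-dimensional "BNF" frame sequences (invented values).
word_a = mean_pool([[1.0, 0.0], [1.0, 0.2]])
word_b = mean_pool([[0.9, 0.1], [1.1, 0.1]])  # treated as the same word as word_a
word_c = mean_pool([[0.0, 1.0], [0.0, 1.0]])  # treated as a different word

loss_same = pair_loss(word_a, word_b, same_word=True)
loss_diff = pair_loss(word_a, word_c, same_word=False)
```

In this toy setup the same-word pair is already close and the different-word pair is already farther apart than the margin, so both losses are near zero; a training loop would adjust the network parameters to reach that state on real word pairs.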

Original language	English
Pages (from-to)	788-792
Number of pages	5
Journal	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume	08-12-September-2016
DOIs	https://doi.org/10.21437/Interspeech.2016-317
State	Published - 2016
Event	17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016 - San Francisco, United States Duration: 8 Sep 2016 → 16 Sep 2016

Keywords

Bottleneck features (BNFs)
Feature representations
Low-resource speech processing
Pairwise learning
Siamese network

Cite this

@article{71c1664bb97f4442bc85bb7c44eb6ee4,

title = "Learning neural network representations using cross-lingual bottleneck features with word-pair information",

abstract = "We assume that only word pairs identified by humans are available in a low-resource target language. The word pairs are parameterized by a bottleneck feature (BNF) extractor that is trained using transcribed data in a high-resource language. The cross-lingual BNFs of the word pairs are used to train another neural network that generates a new feature representation in the target language. Pairwise learning of frame-level and word-level feature representations is investigated. The proposed feature representations were evaluated on a word discrimination task using the Switchboard telephone speech corpus. The learned features bring a 27.5\% relative improvement over the previously best reported result on the task.",

keywords = "Bottleneck features (BNFs), Feature representations, Low-resource speech processing, Pairwise learning, Siamese network",

author = "Yougen Yuan and Leung, {Cheung Chi} and Lei Xie and Bin Ma and Haizhou Li",

note = "Publisher Copyright: Copyright {\textcopyright}2016 ISCA.; 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016 ; Conference date: 08-09-2016 Through 16-09-2016",

year = "2016",

doi = "10.21437/Interspeech.2016-317",

language = "English",

volume = "08-12-September-2016",

pages = "788--792",

journal = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",

issn = "2308-457X",

}

Learning neural network representations using cross-lingual bottleneck features with word-pair information. / Yuan, Yougen; Leung, Cheung Chi; Xie, Lei et al.
In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, Vol. 08-12-September-2016, 2016, p. 788-792.

TY - JOUR

T1 - Learning neural network representations using cross-lingual bottleneck features with word-pair information

AU - Yuan, Yougen

AU - Leung, Cheung Chi

AU - Xie, Lei

AU - Ma, Bin

AU - Li, Haizhou

PY - 2016

Y1 - 2016

AB - We assume that only word pairs identified by humans are available in a low-resource target language. The word pairs are parameterized by a bottleneck feature (BNF) extractor that is trained using transcribed data in a high-resource language. The cross-lingual BNFs of the word pairs are used to train another neural network that generates a new feature representation in the target language. Pairwise learning of frame-level and word-level feature representations is investigated. The proposed feature representations were evaluated on a word discrimination task using the Switchboard telephone speech corpus. The learned features bring a 27.5% relative improvement over the previously best reported result on the task.

KW - Bottleneck features (BNFs)

KW - Feature representations

KW - Low-resource speech processing

KW - Pairwise learning

KW - Siamese network

UR - http://www.scopus.com/inward/record.url?scp=84994228769&partnerID=8YFLogxK

U2 - 10.21437/Interspeech.2016-317

DO - 10.21437/Interspeech.2016-317

M3 - Conference article

AN - SCOPUS:84994228769

SN - 2308-457X

VL - 08-12-September-2016

SP - 788

EP - 792

JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

T2 - 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016

Y2 - 8 September 2016 through 16 September 2016

ER -
