Unsupervised adaptation with adversarial dropout regularization for robust speech recognition

Pengcheng Guo; Sining Sun; Lei Xie

doi:10.21437/Interspeech.2019-2544

School of Computer Science

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Conference article › Peer-reviewed

4 citations (Scopus)

Abstract

Recent adversarial methods proposed for unsupervised domain adaptation of acoustic models try to fool a specific domain discriminator and learn both senone-discriminative and domain-invariant hidden feature representations. However, a drawback of these approaches is that the feature generator simply aligns different features into the same distribution without considering the class boundaries of the target domain data. Thus, ambiguous target domain features can be generated near the decision boundaries, decreasing speech recognition performance. In this study, we propose to use Adversarial Dropout Regularization (ADR) in acoustic modeling to overcome the foregoing issue. Specifically, we optimize the senone classifier to make its decision boundaries lie in the class boundaries of unlabeled target data. Then, the feature generator learns to create features far away from the decision boundaries, which are more discriminative. We apply the ADR approach on the CHiME-3 corpus and the proposed method yields up to 12.9% relative WER reductions compared with the baseline trained on source domain data only and further improvement over the widely used gradient reversal layer method.
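The abstract describes the classifier and the feature generator as working against each other around the decision boundaries. A minimal sketch of the quantity such a scheme typically plays over is given below: the disagreement between two dropout-perturbed passes of the same classifier on one unlabeled target feature vector. It is an illustration under stated assumptions, not the authors' implementation. The symmetric KL divergence is the measure used in the original ADR formulation for image tasks, and the abstract does not say whether this paper uses the same one. All function names, the toy weights, and the dropout rate are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q, eps=1e-12):
    """KL divergence KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """Symmetric KL divergence, assumed here as the disagreement measure."""
    return 0.5 * (kl(p, q) + kl(q, p))


def dropout(features, rate, rng):
    """Inverted dropout: zero each feature with probability `rate`."""
    keep = 1.0 - rate
    return [f / keep if rng.random() < keep else 0.0 for f in features]


def classify(features, weights):
    """Toy linear classifier: one weight row per class (hypothetical)."""
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return softmax(logits)


def adr_discrepancy(features, weights, rate=0.5, seed=0):
    """Disagreement between two dropout samples of the same classifier.

    In an ADR-style scheme the classifier is trained to increase this value
    on unlabeled target data, while the feature generator is trained to
    decrease it, which pushes target features away from decision boundaries.
    """
    rng = random.Random(seed)
    p1 = classify(dropout(features, rate, rng), weights)
    p2 = classify(dropout(features, rate, rng), weights)
    return symmetric_kl(p1, p2)


if __name__ == "__main__":
    toy_weights = [[1.0, -0.5, 0.2, 0.0], [-0.3, 0.8, 0.1, 0.4]]
    toy_target_feature = [0.9, 0.1, -0.4, 0.7]
    print(adr_discrepancy(toy_target_feature, toy_weights))
```

A feature for which the two dropout passes agree gives a value near zero. A feature close to a decision boundary tends to give a larger value, because dropping different units is more likely to change the predicted distribution.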

Original language	English
Pages (from-to)	749-753
Number of pages	5
Journal	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume	2019-September
DOI	https://doi.org/10.21437/Interspeech.2019-2544
Publication status	Published - 2019
Event	20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019 - Graz, Austria. Duration: 15 Sep 2019 → 19 Sep 2019

Cite this

@article{ccbe4bf208e44bd3bd0efb58c7937b5b,

title = "Unsupervised adaptation with adversarial dropout regularization for robust speech recognition",

abstract = "Recent adversarial methods proposed for unsupervised domain adaptation of acoustic models try to fool a specific domain discriminator and learn both senone-discriminative and domain-invariant hidden feature representations. However, a drawback of these approaches is that the feature generator simply aligns different features into the same distribution without considering the class boundaries of the target domain data. Thus, ambiguous target domain features can be generated near the decision boundaries, decreasing speech recognition performance. In this study, we propose to use Adversarial Dropout Regularization (ADR) in acoustic modeling to overcome the foregoing issue. Specifically, we optimize the senone classifier to make its decision boundaries lie in the class boundaries of unlabeled target data. Then, the feature generator learns to create features far away from the decision boundaries, which are more discriminative. We apply the ADR approach on the CHiME-3 corpus and the proposed method yields up to 12.9% relative WER reductions compared with the baseline trained on source domain data only and further improvement over the widely used gradient reversal layer method.",

keywords = "Adversarial training, Domain adaptation, Robust speech recognition",

author = "Pengcheng Guo and Sining Sun and Lei Xie",

note = "Publisher Copyright: Copyright {\textcopyright} 2019 ISCA; 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019 ; Conference date: 15-09-2019 Through 19-09-2019",

year = "2019",

doi = "10.21437/Interspeech.2019-2544",

language = "English",

volume = "2019-September",

pages = "749--753",

journal = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",

issn = "2308-457X",

}

TY - JOUR

T1 - Unsupervised adaptation with adversarial dropout regularization for robust speech recognition

AU - Guo, Pengcheng

AU - Sun, Sining

AU - Xie, Lei

PY - 2019

Y1 - 2019

N2 - Recent adversarial methods proposed for unsupervised domain adaptation of acoustic models try to fool a specific domain discriminator and learn both senone-discriminative and domain-invariant hidden feature representations. However, a drawback of these approaches is that the feature generator simply aligns different features into the same distribution without considering the class boundaries of the target domain data. Thus, ambiguous target domain features can be generated near the decision boundaries, decreasing speech recognition performance. In this study, we propose to use Adversarial Dropout Regularization (ADR) in acoustic modeling to overcome the foregoing issue. Specifically, we optimize the senone classifier to make its decision boundaries lie in the class boundaries of unlabeled target data. Then, the feature generator learns to create features far away from the decision boundaries, which are more discriminative. We apply the ADR approach on the CHiME-3 corpus and the proposed method yields up to 12.9% relative WER reductions compared with the baseline trained on source domain data only and further improvement over the widely used gradient reversal layer method.

AB - Recent adversarial methods proposed for unsupervised domain adaptation of acoustic models try to fool a specific domain discriminator and learn both senone-discriminative and domain-invariant hidden feature representations. However, a drawback of these approaches is that the feature generator simply aligns different features into the same distribution without considering the class boundaries of the target domain data. Thus, ambiguous target domain features can be generated near the decision boundaries, decreasing speech recognition performance. In this study, we propose to use Adversarial Dropout Regularization (ADR) in acoustic modeling to overcome the foregoing issue. Specifically, we optimize the senone classifier to make its decision boundaries lie in the class boundaries of unlabeled target data. Then, the feature generator learns to create features far away from the decision boundaries, which are more discriminative. We apply the ADR approach on the CHiME-3 corpus and the proposed method yields up to 12.9% relative WER reductions compared with the baseline trained on source domain data only and further improvement over the widely used gradient reversal layer method.

KW - Adversarial training

KW - Domain adaptation

KW - Robust speech recognition

UR - http://www.scopus.com/inward/record.url?scp=85074733487&partnerID=8YFLogxK

U2 - 10.21437/Interspeech.2019-2544

DO - 10.21437/Interspeech.2019-2544

M3 - Conference article

AN - SCOPUS:85074733487

SN - 2308-457X

VL - 2019-September

SP - 749

EP - 753

JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

T2 - 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019

Y2 - 15 September 2019 through 19 September 2019

ER -
