Training augmentation with adversarial examples for robust speech recognition

Sining Sun, Ching Feng Yeh, Mari Ostendorf, Mei Yuh Hwang, Lei Xie

Research output: Contribution to journal › Conference article › peer-review

51 Scopus citations

Abstract

This paper explores the use of adversarial examples in training speech recognition systems to increase robustness of deep neural network acoustic models. During training, the fast gradient sign method is used to generate adversarial examples augmenting the original training data. Different from conventional data augmentation based on data transformations, the examples are dynamically generated based on current acoustic model parameters. We assess the impact of adversarial data augmentation in experiments on the Aurora-4 and CHiME-4 single-channel tasks, showing improved robustness against noise and channel variation. Further improvement is obtained when combining adversarial examples with teacher/student training, leading to a 23% relative word error rate reduction on Aurora-4.
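As context for the abstract, the fast gradient sign method (FGSM) perturbs an input in the direction of the sign of the loss gradient with respect to that input. The following is a minimal NumPy sketch of that idea; the linear model, squared-error loss, and epsilon value are illustrative assumptions (the paper applies FGSM to a DNN acoustic model during training, regenerating examples as the model parameters change).

```python
import numpy as np

def fgsm_example(x, y, w, epsilon=0.1):
    """Illustrative FGSM perturbation for a linear model with
    squared-error loss L = 0.5 * (w @ x - y)**2 (an assumption;
    the paper uses a DNN acoustic model, not this toy model).
    The input is nudged by epsilon in the sign of dL/dx."""
    residual = w @ x - y      # scalar prediction error of the current model
    grad_x = residual * w     # analytic dL/dx for the squared-error loss
    return x + epsilon * np.sign(grad_x)

# Augmentation step: pair a clean feature vector with its adversarial copy.
rng = np.random.default_rng(0)
w = rng.normal(size=4)       # current model parameters
x = rng.normal(size=4)       # a clean training example
x_adv = fgsm_example(x, y=0.0, w=w, epsilon=0.1)
```

Because the perturbation depends on the current parameters `w`, regenerating `x_adv` each epoch yields the dynamic augmentation the abstract describes, unlike fixed transformations such as added noise.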

Original language: English
Pages (from-to): 2404-2408
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2018-September
DOIs
State: Published - 2018
Event: 19th Annual Conference of the International Speech Communication Association, INTERSPEECH 2018 - Hyderabad, India
Duration: 2 Sep 2018 - 6 Sep 2018

Keywords

  • Adversarial examples
  • Data augmentation
  • FGSM
  • Robust speech recognition
  • Teacher-student model
