Adversarial regularization for end-to-end robust speaker verification

Qing Wang, Pengcheng Guo, Sining Sun, Lei Xie, John H.L. Hansen

Research output: Contribution to journal › Conference article › peer-review

Abstract

Deep learning has been applied successfully to speaker verification (SV), and end-to-end SV systems in particular have attracted growing interest. However, deep neural networks have been shown to be vulnerable to adversarial examples in both image and speech applications. In this study, we explore two methods for generating adversarial examples against an advanced SV system: (i) the fast gradient-sign method (FGSM) and (ii) the local distributional smoothness (LDS) method. We first use these adversarial examples to attack an end-to-end SV system; experiments show that the neural network is easily disturbed by them. We then propose to train a robust end-to-end SV model by using the two kinds of adversarial examples for model regularization. Experimental results on the TIMIT dataset indicate that the regularized model improves EER on the original test set by a relative (i) 18.89% and (ii) 5.54%. In addition, the regularized model improves EER on the adversarial-example test sets by a relative (i) 30.11% and (ii) 22.12%, suggesting more consistent performance against adversarial-example attacks.
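The abstract gives no implementation detail, so the following PyTorch sketch illustrates how the two kinds of perturbation named above are typically computed: FGSM takes one signed-gradient step on the supervised loss, while LDS (in the style of virtual adversarial training) uses power iteration to find the input direction that most changes the model's output distribution. The model interface, cross-entropy loss, and hyperparameter values here are illustrative assumptions, not the authors' actual end-to-end SV setup.

    # Illustrative sketch only: `model`, the cross-entropy loss, and all
    # hyperparameters are assumptions, not the paper's actual SV system.
    import torch
    import torch.nn.functional as F

    def fgsm_perturbation(model, x, y, epsilon=0.1):
        """FGSM: one step along the sign of the input gradient of the loss."""
        x = x.clone().detach().requires_grad_(True)
        F.cross_entropy(model(x), y).backward()
        return epsilon * x.grad.sign()

    def lds_perturbation(model, x, epsilon=0.1, xi=1e-6, n_iter=1):
        """LDS-style perturbation: power iteration for the direction that
        maximizes KL(p(x) || p(x + d)); no labels are needed."""
        with torch.no_grad():
            p = F.softmax(model(x), dim=-1)
        d = torch.randn_like(x)
        for _ in range(n_iter):
            d = (xi * F.normalize(d.flatten(1), dim=1).view_as(x)).requires_grad_(True)
            kl = F.kl_div(F.log_softmax(model(x + d), dim=-1), p,
                          reduction="batchmean")
            d = torch.autograd.grad(kl, d)[0]
        return epsilon * F.normalize(d.flatten(1), dim=1).view_as(x)

    def adversarial_regularized_loss(model, x, y, epsilon=0.1, alpha=1.0):
        """Clean loss plus the loss on adversarially perturbed inputs,
        one plausible form of the regularized training objective."""
        delta = fgsm_perturbation(model, x, y, epsilon)
        return (F.cross_entropy(model(x), y)
                + alpha * F.cross_entropy(model(x + delta.detach()), y))

In regularized training, each batch would minimize such a combined objective, so the network learns to keep its decision stable in the adversarial direction rather than merely fitting the clean examples.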

Keywords

  • Adversarial example
  • Adversarial regularization
  • End-to-end robust SV
  • Fast gradient-sign method (FGSM)
  • Local distributional smoothness (LDS)
