Unsupervised Domain Adaptation via Domain Adversarial Training for Speaker Recognition

Qing Wang; Wei Rao; Sining Sun; Lei Xie; Eng Siong Chng; Haizhou Li

doi:10.1109/ICASSP.2018.8461423

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

132 Scopus citations

Abstract

The i-vector approach to speaker recognition has achieved good performance when the domain of the evaluation dataset is similar to that of the training dataset. However, in real-world applications, there is always a mismatch between the training and evaluation datasets, which leads to performance degradation. To address this problem, this paper proposes to learn domain-invariant and speaker-discriminative speech representations via domain adversarial training. Specifically, with the domain adversarial training method, we use a gradient reversal layer to remove the domain variation and project the data from different domains into the same subspace. Moreover, we compare the proposed method with other state-of-the-art unsupervised domain adaptation techniques for the i-vector approach to speaker recognition (e.g., autoencoder-based domain adaptation, inter-dataset variability compensation, and dataset-invariant covariance normalization). Experiments on the 2013 domain adaptation challenge (DAC) dataset demonstrate that the proposed method is not only effective in solving the dataset mismatch problem, but also outperforms the compared unsupervised domain adaptation methods.
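The gradient reversal layer mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `GradientReversal`, the weight `lam`, and the toy vectors are assumptions made here for illustration only. The sketch shows only the general idea of such a layer: it passes features through unchanged in the forward pass and negates (and scales) the gradient coming back from a domain classifier in the backward pass, so that the feature extractor is pushed toward domain-invariant representations.

```python
class GradientReversal:
    """Illustrative gradient reversal layer (hypothetical, framework-free)."""

    def __init__(self, lam=1.0):
        # lam is an assumed trade-off weight; its value is not taken from the paper.
        self.lam = lam

    def forward(self, features):
        # Forward pass: identity mapping, features are passed through unchanged.
        return list(features)

    def backward(self, grad_from_domain_classifier):
        # Backward pass: flip the sign of the incoming gradient and scale it by lam.
        return [-self.lam * g for g in grad_from_domain_classifier]


# Toy usage with made-up numbers.
grl = GradientReversal(lam=0.5)
out = grl.forward([0.2, -1.0, 3.0])
grad = grl.backward([1.0, -2.0, 4.0])
```

In an automatic-differentiation framework the same behavior would normally be written as a custom backward function rather than an explicit `backward` method; the plain-Python form is used here only to keep the sketch self-contained.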

Original language	English
Title of host publication	2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	4889-4893
Number of pages	5
ISBN (Print)	9781538646588
DOIs	https://doi.org/10.1109/ICASSP.2018.8461423
State	Published - 10 Sep 2018
Event	2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Calgary, Canada
Duration	15 Apr 2018 → 20 Apr 2018

Publication series

Name	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume	2018-April
ISSN (Print)	1520-6149

Conference

Conference	2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018
Country/Territory	Canada
City	Calgary
Period	15/04/18 → 20/04/18

Keywords

Domain Adversarial Training
Speaker Recognition
Unsupervised Domain Adaptation

Cite this

Wang, Q., Rao, W., Sun, S., Xie, L., Chng, E. S., & Li, H. (2018). Unsupervised Domain Adaptation via Domain Adversarial Training for Speaker Recognition. In 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings (pp. 4889-4893). Article 8461423 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2018-April). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2018.8461423

Wang, Qing ; Rao, Wei ; Sun, Sining et al. / Unsupervised Domain Adaptation via Domain Adversarial Training for Speaker Recognition. 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2018. pp. 4889-4893 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

@inproceedings{69d70aa293fa4467b9635d6d06d5b80e,

title = "Unsupervised Domain Adaptation via Domain Adversarial Training for Speaker Recognition",

abstract = "The i-vector approach to speaker recognition has achieved good performance when the domain of the evaluation dataset is similar to that of the training dataset. However, in real-world applications, there is always a mismatch between the training and evaluation datasets, which leads to performance degradation. To address this problem, this paper proposes to learn domain-invariant and speaker-discriminative speech representations via domain adversarial training. Specifically, with the domain adversarial training method, we use a gradient reversal layer to remove the domain variation and project the data from different domains into the same subspace. Moreover, we compare the proposed method with other state-of-the-art unsupervised domain adaptation techniques for the i-vector approach to speaker recognition (e.g., autoencoder-based domain adaptation, inter-dataset variability compensation, and dataset-invariant covariance normalization). Experiments on the 2013 domain adaptation challenge (DAC) dataset demonstrate that the proposed method is not only effective in solving the dataset mismatch problem, but also outperforms the compared unsupervised domain adaptation methods.",

keywords = "Domain Adversarial Training, Speaker Recognition, Unsupervised Domain Adaptation",

author = "Qing Wang and Wei Rao and Sining Sun and Lei Xie and Chng, {Eng Siong} and Haizhou Li",

note = "Publisher Copyright: {\textcopyright} 2018 IEEE.; 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 ; Conference date: 15-04-2018 Through 20-04-2018",

year = "2018",

month = sep,

day = "10",

doi = "10.1109/ICASSP.2018.8461423",

language = "English",

isbn = "9781538646588",

series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "4889--4893",

booktitle = "2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings",

}

Wang, Q, Rao, W, Sun, S, Xie, L, Chng, ES & Li, H 2018, Unsupervised Domain Adaptation via Domain Adversarial Training for Speaker Recognition. in 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings, 8461423, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, vol. 2018-April, Institute of Electrical and Electronics Engineers Inc., pp. 4889-4893, 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018, Calgary, Canada, 15/04/18. https://doi.org/10.1109/ICASSP.2018.8461423

Unsupervised Domain Adaptation via Domain Adversarial Training for Speaker Recognition. / Wang, Qing; Rao, Wei; Sun, Sining et al.
2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2018. p. 4889-4893 8461423 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2018-April).

TY - GEN

T1 - Unsupervised Domain Adaptation via Domain Adversarial Training for Speaker Recognition

AU - Wang, Qing

AU - Rao, Wei

AU - Sun, Sining

AU - Xie, Lei

AU - Chng, Eng Siong

AU - Li, Haizhou

PY - 2018/9/10

Y1 - 2018/9/10

N2 - The i-vector approach to speaker recognition has achieved good performance when the domain of the evaluation dataset is similar to that of the training dataset. However, in real-world applications, there is always a mismatch between the training and evaluation datasets, which leads to performance degradation. To address this problem, this paper proposes to learn domain-invariant and speaker-discriminative speech representations via domain adversarial training. Specifically, with the domain adversarial training method, we use a gradient reversal layer to remove the domain variation and project the data from different domains into the same subspace. Moreover, we compare the proposed method with other state-of-the-art unsupervised domain adaptation techniques for the i-vector approach to speaker recognition (e.g., autoencoder-based domain adaptation, inter-dataset variability compensation, and dataset-invariant covariance normalization). Experiments on the 2013 domain adaptation challenge (DAC) dataset demonstrate that the proposed method is not only effective in solving the dataset mismatch problem, but also outperforms the compared unsupervised domain adaptation methods.

KW - Domain Adversarial Training

KW - Speaker Recognition

KW - Unsupervised Domain Adaptation

UR - http://www.scopus.com/inward/record.url?scp=85054135361&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2018.8461423

DO - 10.1109/ICASSP.2018.8461423

M3 - Conference contribution

AN - SCOPUS:85054135361

SN - 9781538646588

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 4889

EP - 4893

BT - 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018

Y2 - 15 April 2018 through 20 April 2018

ER -

Wang Q, Rao W, Sun S, Xie L, Chng ES, Li H. Unsupervised Domain Adaptation via Domain Adversarial Training for Speaker Recognition. In 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2018. p. 4889-4893. 8461423. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). doi: 10.1109/ICASSP.2018.8461423
