Adversarial Training for Multi-domain Speaker Recognition

Qing Wang; Wei Rao; Pengcheng Guo; Lei Xie

doi:10.1109/ISCSLP49672.2021.9362053

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

11 Scopus citations

Abstract

In real-life applications, the performance of speaker recognition systems always degrades when there is a mismatch between training and evaluation data. Many domain adaptation methods have been used successfully to reduce domain mismatch in speaker recognition. However, both training and evaluation data are often themselves composed of several subsets, and the variation within each dataset can also be regarded as a set of different domains. Differently distributed subsets in the source or target domain dataset can therefore cause multi-domain mismatches, which affect speaker recognition performance. In this study, we propose using adversarial training for multi-domain speaker recognition to address both the domain mismatch and the dataset variance problems. With the proposed method, we obtain speech representations that are both multi-domain-invariant and speaker-discriminative. Experimental results on the DAC13 dataset indicate that the proposed method not only effectively addresses the multi-domain mismatch problem but also outperforms the compared unsupervised domain adaptation methods.

Original language	English
Title of host publication	2021 12th International Symposium on Chinese Spoken Language Processing, ISCSLP 2021
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9781728169941
DOIs	https://doi.org/10.1109/ISCSLP49672.2021.9362053
State	Published - 24 Jan 2021
Event	12th International Symposium on Chinese Spoken Language Processing, ISCSLP 2021 - Hong Kong, Hong Kong; Duration: 24 Jan 2021 → 27 Jan 2021

Publication series

Name	2021 12th International Symposium on Chinese Spoken Language Processing, ISCSLP 2021

Conference

Conference	12th International Symposium on Chinese Spoken Language Processing, ISCSLP 2021
Country/Territory	Hong Kong
City	Hong Kong
Period	24/01/21 → 27/01/21

Keywords

adversarial training
multi-domain adaptation
speaker recognition

Cite this

Wang, Q., Rao, W., Guo, P., & Xie, L. (2021). Adversarial Training for Multi-domain Speaker Recognition. In 2021 12th International Symposium on Chinese Spoken Language Processing, ISCSLP 2021, Article 9362053 (2021 12th International Symposium on Chinese Spoken Language Processing, ISCSLP 2021). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ISCSLP49672.2021.9362053

@inproceedings{74b6d2d5881d484aa37cffeae099961c,
title = "Adversarial Training for Multi-domain Speaker Recognition",
keywords = "adversarial training, multi-domain adaptation, speaker recognition",
author = "Qing Wang and Wei Rao and Pengcheng Guo and Lei Xie",
note = "Publisher Copyright: {\textcopyright} 2021 IEEE.; 12th International Symposium on Chinese Spoken Language Processing, ISCSLP 2021 ; Conference date: 24-01-2021 Through 27-01-2021",
year = "2021",
month = jan,
day = "24",
doi = "10.1109/ISCSLP49672.2021.9362053",
language = "English",
series = "2021 12th International Symposium on Chinese Spoken Language Processing, ISCSLP 2021",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
booktitle = "2021 12th International Symposium on Chinese Spoken Language Processing, ISCSLP 2021",
}
