Backend Ensemble for Speaker Verification and Spoofing Countermeasure

Li Zhang; Yue Li; Huan Zhao; Qing Wang; Lei Xie

doi:10.21437/Interspeech.2022-10259

School of Computer Science

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Conference article › Peer-reviewed

8 Citations (Scopus)

Abstract

This paper describes the NPU system submitted to Spoofing Aware Speaker Verification Challenge 2022. We particularly focus on the backend ensemble for speaker verification and spoofing countermeasure from three aspects. Firstly, besides simple concatenation, we propose circulant matrix transformation and stacking for speaker embeddings and countermeasure embeddings. With the stacking operation of newly-defined circulant embeddings, we almost explore all the possible interactions between speaker embeddings and countermeasure embeddings. Secondly, we attempt different convolution neural networks to selectively fuse the embeddings' salient regions into channels with convolution kernels. Finally, we design parallel attention in 1D convolution neural networks to learn the global correlation in channel dimensions as well as to learn the important parts in feature dimensions. Meanwhile, we embed squeeze-and-excitation attention in 2D convolutional neural networks to learn the global dependence among speaker embeddings and countermeasure embeddings. Experimental results demonstrate that all the above methods are effective. After fusion of four well-trained models enhanced by the mentioned methods, the best SASV-EER, SPF-EER and SV-EER we achieve are 0.559%, 0.354% and 0.857% on the evaluation set respectively. Together with the above contributions, our submission system achieves the fifth place in this challenge.
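
The abstract does not spell out how the circulant embeddings are built, so the following is only a minimal sketch of one plausible reading: each embedding is expanded into a circulant matrix (every row is the previous row rotated by one position), and the two matrices are stacked as channels for a 2D convolutional backend. The function names, the equal-length requirement, and the toy values are assumptions made for illustration, not details taken from the paper.

```python
from typing import List

Matrix = List[List[float]]


def circulant(vec: List[float]) -> Matrix:
    """Build the n x n circulant matrix whose i-th row is vec rotated right by i."""
    n = len(vec)
    return [[vec[(j - i) % n] for j in range(n)] for i in range(n)]


def stack_circulant(spk: List[float], cm: List[float]) -> List[Matrix]:
    """Stack both circulant matrices as a 2-channel input of shape 2 x n x n.

    Assumption: both embeddings have the same length. Real speaker and
    countermeasure embeddings may differ in size and would first need a
    projection to a common dimension.
    """
    if len(spk) != len(cm):
        raise ValueError("this sketch assumes equal-length embeddings")
    return [circulant(spk), circulant(cm)]


# Toy embeddings with made-up values, only to show the resulting layout.
spk_emb = [1.0, 2.0, 3.0]
cm_emb = [4.0, 5.0, 6.0]
stacked = stack_circulant(spk_emb, cm_emb)
```

Under this reading, every element of one embedding is aligned with every element of the other at some row of the stacked input, which is one way to interpret the claim that stacking circulant embeddings covers almost all interactions between the two embeddings.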

Original language	English
Pages (from-to)	4381-4385
Number of pages	5
Journal	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume	2022-September
DOI	https://doi.org/10.21437/Interspeech.2022-10259
Publication status	Published - 2022
Event	23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022 - Incheon, South Korea; Duration: 18 Sep 2022 → 22 Sep 2022

Cite this

@article{30b68f34e86747f8ac76f3ea138024bb,

title = "Backend Ensemble for Speaker Verification and Spoofing Countermeasure",

abstract = "This paper describes the NPU system submitted to Spoofing Aware Speaker Verification Challenge 2022. We particularly focus on the backend ensemble for speaker verification and spoofing countermeasure from three aspects. Firstly, besides simple concatenation, we propose circulant matrix transformation and stacking for speaker embeddings and countermeasure embeddings. With the stacking operation of newly-defined circulant embeddings, we almost explore all the possible interactions between speaker embeddings and countermeasure embeddings. Secondly, we attempt different convolution neural networks to selectively fuse the embeddings' salient regions into channels with convolution kernels. Finally, we design parallel attention in 1D convolution neural networks to learn the global correlation in channel dimensions as well as to learn the important parts in feature dimensions. Meanwhile, we embed squeeze-and-excitation attention in 2D convolutional neural networks to learn the global dependence among speaker embeddings and countermeasure embeddings. Experimental results demonstrate that all the above methods are effective. After fusion of four well-trained models enhanced by the mentioned methods, the best SASV-EER, SPF-EER and SV-EER we achieve are 0.559%, 0.354% and 0.857% on the evaluation set respectively. Together with the above contributions, our submission system achieves the fifth place in this challenge.",

keywords = "backend ensemble, speaker verification, spoofing countermeasure",

author = "Li Zhang and Yue Li and Huan Zhao and Qing Wang and Lei Xie",

note = "Publisher Copyright: Copyright {\textcopyright} 2022 ISCA.; 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022 ; Conference date: 18-09-2022 Through 22-09-2022",

year = "2022",

doi = "10.21437/Interspeech.2022-10259",

language = "English",

volume = "2022-September",

pages = "4381--4385",

journal = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",

issn = "2308-457X",

}

TY - JOUR

T1 - Backend Ensemble for Speaker Verification and Spoofing Countermeasure

AU - Zhang, Li

AU - Li, Yue

AU - Zhao, Huan

AU - Wang, Qing

AU - Xie, Lei

PY - 2022

Y1 - 2022

N2 - This paper describes the NPU system submitted to Spoofing Aware Speaker Verification Challenge 2022. We particularly focus on the backend ensemble for speaker verification and spoofing countermeasure from three aspects. Firstly, besides simple concatenation, we propose circulant matrix transformation and stacking for speaker embeddings and countermeasure embeddings. With the stacking operation of newly-defined circulant embeddings, we almost explore all the possible interactions between speaker embeddings and countermeasure embeddings. Secondly, we attempt different convolution neural networks to selectively fuse the embeddings' salient regions into channels with convolution kernels. Finally, we design parallel attention in 1D convolution neural networks to learn the global correlation in channel dimensions as well as to learn the important parts in feature dimensions. Meanwhile, we embed squeeze-and-excitation attention in 2D convolutional neural networks to learn the global dependence among speaker embeddings and countermeasure embeddings. Experimental results demonstrate that all the above methods are effective. After fusion of four well-trained models enhanced by the mentioned methods, the best SASV-EER, SPF-EER and SV-EER we achieve are 0.559%, 0.354% and 0.857% on the evaluation set respectively. Together with the above contributions, our submission system achieves the fifth place in this challenge.

AB - This paper describes the NPU system submitted to Spoofing Aware Speaker Verification Challenge 2022. We particularly focus on the backend ensemble for speaker verification and spoofing countermeasure from three aspects. Firstly, besides simple concatenation, we propose circulant matrix transformation and stacking for speaker embeddings and countermeasure embeddings. With the stacking operation of newly-defined circulant embeddings, we almost explore all the possible interactions between speaker embeddings and countermeasure embeddings. Secondly, we attempt different convolution neural networks to selectively fuse the embeddings' salient regions into channels with convolution kernels. Finally, we design parallel attention in 1D convolution neural networks to learn the global correlation in channel dimensions as well as to learn the important parts in feature dimensions. Meanwhile, we embed squeeze-and-excitation attention in 2D convolutional neural networks to learn the global dependence among speaker embeddings and countermeasure embeddings. Experimental results demonstrate that all the above methods are effective. After fusion of four well-trained models enhanced by the mentioned methods, the best SASV-EER, SPF-EER and SV-EER we achieve are 0.559%, 0.354% and 0.857% on the evaluation set respectively. Together with the above contributions, our submission system achieves the fifth place in this challenge.

KW - backend ensemble

KW - speaker verification

KW - spoofing countermeasure

UR - http://www.scopus.com/inward/record.url?scp=85128605176&partnerID=8YFLogxK

U2 - 10.21437/Interspeech.2022-10259

DO - 10.21437/Interspeech.2022-10259

M3 - Conference article

AN - SCOPUS:85128605176

SN - 2308-457X

VL - 2022-September

SP - 4381

EP - 4385

JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

T2 - 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022

Y2 - 18 September 2022 through 22 September 2022

ER -
