NPU speaker verification system for Interspeech 2020 Far-Field Speaker Verification Challenge

Li Zhang, Jian Wu, Lei Xie

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

8 Scopus citations

Abstract

This paper describes the NPU system submitted to the Interspeech 2020 Far-Field Speaker Verification Challenge (FFSVC). We focus on far-field text-dependent speaker verification (SV) from a single microphone array (task 1) and from multiple microphone arrays (task 3). The major challenges in such scenarios are short utterances and the cross-channel and distance mismatch between enrollment and test. Believing that a better speaker embedding can alleviate the effects of short utterances, we introduce a new speaker embedding architecture, ResNet-BAM, which integrates a bottleneck attention module with ResNet as a simple and efficient way to further improve the representation power of ResNet. This contribution brings up to 1% EER reduction. We further address the mismatch problem in three directions. First, domain adversarial training, which aims to learn domain-invariant features, can yield a 0.8% EER reduction. Second, front-end signal processing, including WPE and beamforming, makes no obvious contribution on its own, but together with data selection and domain adversarial training it can contribute a further 0.5% EER reduction. Finally, data augmentation, combined with a specifically designed data selection strategy, can lead to a 2% EER reduction. With these contributions combined, in the mid-challenge results our single submission system (without multi-system fusion) achieves first place on task 1 and second place on task 3.
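
The gains in the abstract are all reported as equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The abstract does not describe the scoring pipeline, so the following is only a minimal sketch of how EER can be computed from trial scores, assuming a plain threshold sweep over the observed scores. The function name `equal_error_rate` and the toy scores are illustrative and are not taken from the paper.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Return (eer, threshold) for a set of verification trial scores.

    Assumption: a trial is accepted when its score is >= threshold, and
    higher scores mean "more likely the same speaker".
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)

    # Candidate thresholds: every distinct score seen in the trials.
    thresholds = np.unique(np.concatenate([target, nontarget]))

    # False acceptance rate: fraction of non-target trials accepted.
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    # False rejection rate: fraction of target trials rejected.
    frr = np.array([(target < t).mean() for t in thresholds])

    # Take the threshold where the two error rates are closest and
    # report their average as the EER.
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    eer, thr = equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.6, 0.4, 0.2, 0.1])
    print(f"EER = {eer:.2%} at threshold {thr}")
```

On real evaluation sets the sweep is usually done on sorted scores, with interpolation between neighbouring thresholds; the brute-force loop here is chosen for readability.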

Original language: English
Title of host publication: Interspeech 2020
Publisher: International Speech Communication Association
Pages: 3471-3475
Number of pages: 5
ISBN (Print): 9781713820697
DOIs
State: Published - 2020
Event: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 - Shanghai, China
Duration: 25 Oct 2020 to 29 Oct 2020

Publication series

Name: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2020-October
ISSN (Print): 2308-457X
ISSN (Electronic): 1990-9772

Conference

Conference: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020
Country/Territory: China
City: Shanghai
Period: 25/10/20 to 29/10/20

Keywords

  • Data augmentation
  • Domain adversarial training
  • Far-field
  • Speaker verification
