Symmetric Saliency-Based Adversarial Attack to Speaker Identification

Jiadi Yao; Xing Chen; Xiao Lei Zhang; Wei Qiang Zhang; Kunde Yang

doi:10.1109/LSP.2023.3236509


Ocean Institute

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

To our knowledge, existing adversarial attack approaches to speaker identification either incur a high computational cost or are not very effective. To address this issue, in this letter we propose a novel generation-network-based approach, called the symmetric saliency-based encoder-decoder (SSED), to generate adversarial voice examples against speaker identification. It contains two novel components. First, it uses a novel saliency map decoder to learn the importance of speech samples to the decision of a targeted speaker identification system, so that the attacker focuses on generating artificial noise for the important samples. Second, it uses an angular loss function to push the speaker embedding far away from the source speaker. Our experimental results demonstrate that the proposed SSED yields state-of-the-art performance, i.e., over 97% targeted attack success rate and a signal-to-noise ratio of over 39 dB on both the open-set and closed-set speaker identification tasks, at a low computational cost.
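For readers unfamiliar with the signal-to-noise figure quoted in the abstract, the following is a minimal sketch of how such a value is commonly computed between a clean waveform and its adversarial copy. It assumes the usual definition, 10·log10 of signal energy over perturbation energy; the record does not state the exact evaluation procedure, and the function name `snr_db` and the synthetic sine-wave inputs are hypothetical, chosen only for illustration.

```python
import math


def snr_db(clean, adversarial):
    """SNR in dB, treating (adversarial - clean) as the noise.

    Assumed definition: 10 * log10(sum(clean^2) / sum(noise^2)).
    """
    if len(clean) != len(adversarial):
        raise ValueError("waveforms must have the same length")
    signal_energy = sum(x * x for x in clean)
    noise_energy = sum((a - x) ** 2 for x, a in zip(clean, adversarial))
    if noise_energy == 0:
        return math.inf  # identical waveforms: no perturbation at all
    return 10.0 * math.log10(signal_energy / noise_energy)


# Synthetic stand-ins (not real speech): a unit-amplitude 440 Hz tone
# sampled at 16 kHz, plus a perturbation with 1/100 of its amplitude.
rate = 16000
clean = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
adversarial = [
    x + 0.01 * math.cos(2 * math.pi * 3000 * n / rate)
    for n, x in enumerate(clean)
]

print(f"SNR: {snr_db(clean, adversarial):.1f} dB")
```

An amplitude ratio of 100:1 between signal and perturbation corresponds to about 40 dB, so a reported level of over 39 dB indicates a perturbation roughly two orders of magnitude smaller in amplitude than the speech itself.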

Original language	English
Pages (from-to)	1-5
Number of pages	5
Journal	IEEE Signal Processing Letters
Volume	30
DOIs	https://doi.org/10.1109/LSP.2023.3236509
State	Published - 2023

Keywords

Adversarial attack
angular loss
saliency map decoder
speaker identification

Access to Document

10.1109/LSP.2023.3236509

Cite this

@article{de7b890653cc414c8ac3d613e85f5f13,

title = "Symmetric Saliency-Based Adversarial Attack to Speaker Identification",

abstract = "To our knowledge, existing adversarial attack approaches to speaker identification either incur a high computational cost or are not very effective. To address this issue, in this letter we propose a novel generation-network-based approach, called the symmetric saliency-based encoder-decoder (SSED), to generate adversarial voice examples against speaker identification. It contains two novel components. First, it uses a novel saliency map decoder to learn the importance of speech samples to the decision of a targeted speaker identification system, so that the attacker focuses on generating artificial noise for the important samples. Second, it uses an angular loss function to push the speaker embedding far away from the source speaker. Our experimental results demonstrate that the proposed SSED yields state-of-the-art performance, i.e., over 97\% targeted attack success rate and a signal-to-noise ratio of over 39 dB on both the open-set and closed-set speaker identification tasks, at a low computational cost.",

keywords = "Adversarial attack, angular loss, saliency map decoder, speaker identification",

author = "Jiadi Yao and Xing Chen and Zhang, {Xiao Lei} and Zhang, {Wei Qiang} and Kunde Yang",

note = "Publisher Copyright: {\textcopyright} 1994-2012 IEEE.",

year = "2023",

doi = "10.1109/LSP.2023.3236509",

language = "English",

volume = "30",

pages = "1--5",

journal = "IEEE Signal Processing Letters",

issn = "1070-9908",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Symmetric Saliency-Based Adversarial Attack to Speaker Identification

AU - Yao, Jiadi

AU - Chen, Xing

AU - Zhang, Xiao Lei

AU - Zhang, Wei Qiang

AU - Yang, Kunde

PY - 2023

Y1 - 2023

AB - To our knowledge, existing adversarial attack approaches to speaker identification either incur a high computational cost or are not very effective. To address this issue, in this letter we propose a novel generation-network-based approach, called the symmetric saliency-based encoder-decoder (SSED), to generate adversarial voice examples against speaker identification. It contains two novel components. First, it uses a novel saliency map decoder to learn the importance of speech samples to the decision of a targeted speaker identification system, so that the attacker focuses on generating artificial noise for the important samples. Second, it uses an angular loss function to push the speaker embedding far away from the source speaker. Our experimental results demonstrate that the proposed SSED yields state-of-the-art performance, i.e., over 97% targeted attack success rate and a signal-to-noise ratio of over 39 dB on both the open-set and closed-set speaker identification tasks, at a low computational cost.

KW - Adversarial attack

KW - angular loss

KW - saliency map decoder

KW - speaker identification

UR - http://www.scopus.com/inward/record.url?scp=85147281225&partnerID=8YFLogxK

U2 - 10.1109/LSP.2023.3236509

DO - 10.1109/LSP.2023.3236509

M3 - Article

AN - SCOPUS:85147281225

SN - 1070-9908

VL - 30

SP - 1

EP - 5

JO - IEEE Signal Processing Letters

JF - IEEE Signal Processing Letters

ER -
