Inaudible adversarial perturbations for targeted attack in speaker recognition

Qing Wang; Pengcheng Guo; Lei Xie

doi:10.21437/Interspeech.2020-1955

School of Computer Science

Northwestern Polytechnical University Xian

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

31 Citations (Scopus)

Abstract

Speaker recognition is a popular topic in biometric authentication, and many deep learning approaches have achieved extraordinary performance. However, it has been shown in both image and speech applications that deep neural networks are vulnerable to adversarial examples. In this study, we aim to exploit this weakness to perform targeted adversarial attacks against the x-vector based speaker recognition system. We propose to generate inaudible adversarial perturbations based on the psychoacoustic principle of frequency masking, achieving targeted white-box attacks on the speaker recognition system. Specifically, we constrain the perturbation to lie under the masking threshold of the original audio, instead of using a common lp norm to measure the perturbations. Experiments on the Aishell-1 corpus show that our approach yields up to a 98.5% attack success rate on speaker targets of arbitrary gender, while remaining indistinguishable to listeners. Furthermore, we also achieve an effective speaker attack when applying the proposed approach to a completely irrelevant waveform, such as music.
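The idea of keeping a perturbation under a per-frequency threshold, rather than bounding its lp norm, can be illustrated with a small penalty term. The following is only a minimal sketch and not the authors' implementation: the function names `frame_power_db` and `masking_hinge_loss`, the frame settings, and especially the threshold used in the demo (the original audio's power spectrum lowered by a fixed 20 dB) are assumptions made for illustration. A real psychoacoustic masking threshold would come from a frequency-masking model, which is not reproduced here.

```python
import numpy as np


def frame_power_db(signal, frame_len=512, hop=256):
    """Per-frame power spectrum in dB, using a Hann window (illustrative settings)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    spectrum = np.fft.rfft(frames, axis=1)
    power = np.abs(spectrum) ** 2 / frame_len
    # Small constant avoids log(0) on silent bins.
    return 10.0 * np.log10(power + 1e-12)


def masking_hinge_loss(delta, threshold_db, frame_len=512, hop=256):
    """Mean excess (in dB) of the perturbation's spectrum over the threshold.

    Bins where the perturbation stays below the threshold contribute zero,
    so the penalty vanishes once the perturbation is fully "under" it.
    """
    excess = frame_power_db(delta, frame_len, hop) - threshold_db
    return float(np.mean(np.maximum(excess, 0.0)))


# Demo with synthetic data (assumption: white noise stands in for speech).
rng = np.random.default_rng(0)
audio = rng.standard_normal(16000)

# Hypothetical stand-in threshold: 20 dB below the audio's own spectrum.
# This is NOT a psychoacoustic model, only a placeholder with the same shape.
threshold_db = frame_power_db(audio) - 20.0

quiet_delta = 1e-2 * audio  # 40 dB below the audio, under the threshold
loud_delta = audio.copy()   # as loud as the audio, 20 dB over the threshold

print(masking_hinge_loss(quiet_delta, threshold_db))
print(masking_hinge_loss(loud_delta, threshold_db))
```

In an attack setting, a term of this kind would typically be added to the targeted classification loss, so that the optimizer is pushed toward perturbations that fool the model while staying below the threshold in every time-frequency bin.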

Original language	English
Title of host publication	Interspeech 2020
Publisher	International Speech Communication Association
Pages	4228-4232
Number of pages	5
ISBN (Print)	9781713820697
DOI	https://doi.org/10.21437/Interspeech.2020-1955
Publication status	Published - 2020
Event	21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 - Shanghai, China; Duration: 25 Oct 2020 → 29 Oct 2020

Publication series

Name	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume	2020-October
ISSN (Print)	2308-457X
ISSN (Electronic)	1990-9772

Conference

Conference	21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020
Country/Territory	China
City	Shanghai
Period	25/10/20 → 29/10/20

Cite this

Wang, Q., Guo, P., & Xie, L. (2020). Inaudible adversarial perturbations for targeted attack in speaker recognition. In Interspeech 2020 (pp. 4228-4232). (Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH; Vol. 2020-October). International Speech Communication Association. https://doi.org/10.21437/Interspeech.2020-1955

@inproceedings{551d8e254c124f77b09ef3d46feaed25,

title = "Inaudible adversarial perturbations for targeted attack in speaker recognition",

abstract = "Speaker recognition is a popular topic in biometric authentication and many deep learning approaches have achieved extraordinary performances. However, it has been shown in both image and speech applications that deep neural networks are vulnerable to adversarial examples. In this study, we aim to exploit this weakness to perform targeted adversarial attacks against the x-vector based speaker recognition system. We propose to generate inaudible adversarial perturbations based on the psychoacoustic principle of frequency masking, achieving targeted white-box attacks to speaker recognition system. Specifically, we constrict the perturbation under the masking threshold of original audio, instead of using a common lp norm to measure the perturbations. Experiments on Aishell-1 corpus show that our approach yields up to 98.5% attack success rate to arbitrary gender speaker targets, while retaining indistinguishable attribute to listeners. Furthermore, we also achieve an effective speaker attack when applying the proposed approach to a completely irrelevant waveform, such as music.",

keywords = "Adversarial example, Inaudible, Speaker recognition, Targeted adversarial attack",

author = "Qing Wang and Pengcheng Guo and Lei Xie",

note = "Publisher Copyright: Copyright {\textcopyright} 2020 ISCA; 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 ; Conference date: 25-10-2020 Through 29-10-2020",

year = "2020",

doi = "10.21437/Interspeech.2020-1955",

language = "English",

isbn = "9781713820697",

series = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",

publisher = "International Speech Communication Association",

pages = "4228--4232",

booktitle = "Interspeech 2020",

}

Wang, Q, Guo, P & Xie, L 2020, Inaudible adversarial perturbations for targeted attack in speaker recognition. in Interspeech 2020. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2020-October, International Speech Communication Association, pp. 4228-4232, 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020, Shanghai, China, 25/10/20. https://doi.org/10.21437/Interspeech.2020-1955

Inaudible adversarial perturbations for targeted attack in speaker recognition. / Wang, Qing; Guo, Pengcheng; Xie, Lei.
Interspeech 2020. International Speech Communication Association, 2020. p. 4228-4232 (Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH; Vol. 2020-October).

TY - GEN

T1 - Inaudible adversarial perturbations for targeted attack in speaker recognition

AU - Wang, Qing

AU - Guo, Pengcheng

AU - Xie, Lei

PY - 2020

Y1 - 2020

AB - Speaker recognition is a popular topic in biometric authentication and many deep learning approaches have achieved extraordinary performances. However, it has been shown in both image and speech applications that deep neural networks are vulnerable to adversarial examples. In this study, we aim to exploit this weakness to perform targeted adversarial attacks against the x-vector based speaker recognition system. We propose to generate inaudible adversarial perturbations based on the psychoacoustic principle of frequency masking, achieving targeted white-box attacks to speaker recognition system. Specifically, we constrict the perturbation under the masking threshold of original audio, instead of using a common lp norm to measure the perturbations. Experiments on Aishell-1 corpus show that our approach yields up to 98.5% attack success rate to arbitrary gender speaker targets, while retaining indistinguishable attribute to listeners. Furthermore, we also achieve an effective speaker attack when applying the proposed approach to a completely irrelevant waveform, such as music.

KW - Adversarial example

KW - Inaudible

KW - Speaker recognition

KW - Targeted adversarial attack

UR - http://www.scopus.com/inward/record.url?scp=85098192009&partnerID=8YFLogxK

U2 - 10.21437/Interspeech.2020-1955

DO - 10.21437/Interspeech.2020-1955

M3 - Conference contribution

AN - SCOPUS:85098192009

SN - 9781713820697

T3 - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

SP - 4228

EP - 4232

BT - Interspeech 2020

PB - International Speech Communication Association

T2 - 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020

Y2 - 25 October 2020 through 29 October 2020

ER -