TY - JOUR
T1 - A backdoor attack method based on target feature enhanced generative network
AU - Zhao, Changfei
AU - Xiao, Tao
AU - Deng, Xinyang
AU - Jiang, Wen
N1 - Publisher Copyright:
© 2024 Elsevier Inc.
PY - 2025/4
Y1 - 2025/4
N2 - Backdoor attacks hinder the real-world deployment of neural networks. Attacks under the comprehensive privilege threat model can access both data and models, posing serious security risks. However, existing attacks make inadequate use of the attacked model, making it difficult to guarantee the attack performance and robustness of the generated triggers. In this paper, we propose a backdoor attack method based on a target feature enhanced generative network. Specifically, we use the gradients of the attacked model with respect to the features of clean samples to weight the features of the target class samples, and introduce them into the decoder of the generative network to enhance the diversity and stealthiness of the triggers. Moreover, we design a three-phase backdoor model generation strategy to guarantee the validity of the features fed into the encoder and the adaptability of the backdoor model to the generated triggers. Extensive experiments on mainstream datasets and models demonstrate that the proposed method achieves superior attack performance compared to the baselines, especially in stringent settings with low poisoning rates, while the trigger noise remains concealed. In addition, against mainstream backdoor defenses, the proposed method shows superior robustness and still maintains satisfactory attack performance.
AB - Backdoor attacks hinder the real-world deployment of neural networks. Attacks under the comprehensive privilege threat model can access both data and models, posing serious security risks. However, existing attacks make inadequate use of the attacked model, making it difficult to guarantee the attack performance and robustness of the generated triggers. In this paper, we propose a backdoor attack method based on a target feature enhanced generative network. Specifically, we use the gradients of the attacked model with respect to the features of clean samples to weight the features of the target class samples, and introduce them into the decoder of the generative network to enhance the diversity and stealthiness of the triggers. Moreover, we design a three-phase backdoor model generation strategy to guarantee the validity of the features fed into the encoder and the adaptability of the backdoor model to the generated triggers. Extensive experiments on mainstream datasets and models demonstrate that the proposed method achieves superior attack performance compared to the baselines, especially in stringent settings with low poisoning rates, while the trigger noise remains concealed. In addition, against mainstream backdoor defenses, the proposed method shows superior robustness and still maintains satisfactory attack performance.
KW - Backdoor attack
KW - Generative network
KW - Gradient
KW - Hidden layer feature
UR - http://www.scopus.com/inward/record.url?scp=85212567603&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2024.121776
DO - 10.1016/j.ins.2024.121776
M3 - Article
AN - SCOPUS:85212567603
SN - 0020-0255
VL - 698
JO - Information Sciences
JF - Information Sciences
M1 - 121776
ER -