TY - GEN
T1 - ERASING CONCEPT COMBINATIONS FROM TEXT-TO-IMAGE DIFFUSION MODEL
AU - Nie, Hongyi
AU - Yao, Quanming
AU - Liu, Yang
AU - Wang, Zhen
AU - Bian, Yatao
N1 - Publisher Copyright:
© 2025 13th International Conference on Learning Representations, ICLR 2025. All rights reserved.
PY - 2025
Y1 - 2025
N2 - Advancements in the text-to-image diffusion model have raised security concerns due to their potential to generate images with inappropriate themes such as societal biases and copyright infringements. Current studies have made notable progress in preventing the model from generating images containing specific high-risk visual concepts. However, these methods neglect the issue that inappropriate themes may also arise from the combination of benign visual concepts. A crucial challenge arises because the same image theme can be represented through multiple distinct visual concept combinations, and the model's ability to generate individual concepts may become distorted when processing these combinations. Consequently, effectively erasing such visual concept combinations from the diffusion model remains a formidable challenge. To tackle this problem, we formalize the problem as the Concept Combination Erasing (CCE) problem and propose a Concept Graph-based high-level Feature Decoupling framework (COGFD) to address CCE. COGFD identifies and decomposes visual concept combinations with a consistent image theme from an LLM-induced concept logic graph, and erases these combinations through decoupling co-occurrent high-level features. These techniques enable COGFD to eliminate undesirable visual concept combinations while minimizing adverse effects on the generative fidelity of related individual concepts, outperforming state-of-the-art baselines. Extensive experiments across diverse visual concept combination scenarios verify the effectiveness of COGFD.
AB - Advancements in the text-to-image diffusion model have raised security concerns due to their potential to generate images with inappropriate themes such as societal biases and copyright infringements. Current studies have made notable progress in preventing the model from generating images containing specific high-risk visual concepts. However, these methods neglect the issue that inappropriate themes may also arise from the combination of benign visual concepts. A crucial challenge arises because the same image theme can be represented through multiple distinct visual concept combinations, and the model's ability to generate individual concepts may become distorted when processing these combinations. Consequently, effectively erasing such visual concept combinations from the diffusion model remains a formidable challenge. To tackle this problem, we formalize the problem as the Concept Combination Erasing (CCE) problem and propose a Concept Graph-based high-level Feature Decoupling framework (COGFD) to address CCE. COGFD identifies and decomposes visual concept combinations with a consistent image theme from an LLM-induced concept logic graph, and erases these combinations through decoupling co-occurrent high-level features. These techniques enable COGFD to eliminate undesirable visual concept combinations while minimizing adverse effects on the generative fidelity of related individual concepts, outperforming state-of-the-art baselines. Extensive experiments across diverse visual concept combination scenarios verify the effectiveness of COGFD.
UR - https://www.scopus.com/pages/publications/105010285654
M3 - 会议稿件
AN - SCOPUS:105010285654
T3 - 13th International Conference on Learning Representations, ICLR 2025
SP - 63738
EP - 63758
BT - 13th International Conference on Learning Representations, ICLR 2025
PB - International Conference on Learning Representations, ICLR
T2 - 13th International Conference on Learning Representations, ICLR 2025
Y2 - 24 April 2025 through 28 April 2025
ER -