T-Person-GAN: Text-to-Person image generation with identity-consistency and manifold mix-up

Deyin Liu, Lin Yuanbo Wu, Bo Li, Ye Zhao, Zongyuan Ge, Jian Zhang

Research output: Contribution to journal › Article › peer-review

Abstract

In this paper, we introduce an end-to-end solution for generating high-resolution person images based solely on textual descriptions. While text-to-image models have made great strides in generating images of objects like flowers and birds, creating person images presents a unique set of challenges: 1) Identity Consistency: For the same person, it's crucial that the generated images exhibit visual details that maintain identity consistency. This means that features like identity-related textures, clothing, and even footwear should be consistent across different images of the same person. 2) Discriminative Power: The generated person images need to be robust in the face of inter-person variations caused by visual ambiguities. To tackle these challenges, we propose a generative model that leverages two novel mechanisms: 1) T-Person-GAN-ID: This mechanism integrates a one-stream generator with an identity-preserving network. It regularizes the representations of generated data in their feature space to ensure identity-consistency. This ensures that images of the same person maintain their unique identity-related features. 2) T-Person-GAN-ID-MM: Manifold mix-up is introduced to create mixed images, which involves linear interpolation between generated images from different manifold identities. We further enforce these interpolated images to be linearly classified in the feature space, essentially learning a linear classification boundary that can perfectly separate images from two distinct identities. The proposed method demonstrates a significant improvement in the challenging task of generating person images from text descriptions. We achieve impressive results with a Fréchet Inception Distance of 47.81, an Inception Score of 3.96, and a Visual-Semantic Similarity of 0.21 on the benchmark dataset.
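
The mix-up step described in the abstract can be illustrated with a minimal sketch. It only shows the general idea of linearly interpolating two samples from different identities and scoring the result against a correspondingly mixed label. The function names, the tiny random arrays standing in for generated images, the linear classifier `W`, and the soft-label cross-entropy are all assumptions made for illustration; the abstract does not specify the authors' actual loss or architecture.

```python
import numpy as np

def mix_up(x_a, x_b, lam):
    """Linearly interpolate two same-shaped arrays with weight lam in [0, 1]."""
    return lam * x_a + (1.0 - lam) * x_b

def mixed_cross_entropy(logits, id_a, id_b, lam):
    """Cross-entropy against the soft label lam*onehot(id_a) + (1-lam)*onehot(id_b).

    This is a common mix-up objective, assumed here; it is not taken from the paper.
    """
    shifted = logits - logits.max()                      # stabilise the softmax
    log_probs = shifted - np.log(np.exp(shifted).sum())  # log-softmax
    return -(lam * log_probs[id_a] + (1.0 - lam) * log_probs[id_b])

rng = np.random.default_rng(0)
img_a = rng.random((3, 8, 4))   # stand-in for a generated image of identity 0
img_b = rng.random((3, 8, 4))   # stand-in for a generated image of identity 1
lam = 0.3
mixed = mix_up(img_a, img_b, lam)

# Hypothetical linear classifier over flattened features, two identities.
W = rng.normal(size=(2, mixed.size))
loss = mixed_cross_entropy(W @ mixed.ravel(), 0, 1, lam)
```

With `lam = 1.0` the mixed sample reduces to the first input and the loss reduces to ordinary cross-entropy on identity 0, so the interpolation weight controls how strongly each identity contributes to both the sample and its target.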

Original language: English
Article number: 128178
Journal: Expert Systems with Applications
Volume: 288
DOI
Publication status: Published - 1 Sep 2025
