Toward Accurate Human Parsing Through Edge Guided Diffusion

Ting Liu; Hongkun Zhu; Yunchao Wei; Shikui Wei; Yao Zhao; Yanning Zhang

doi:10.1109/TIP.2024.3379931

School of Computer Science

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Existing human parsing frameworks commonly employ joint learning of semantic edge detection and human parsing to facilitate localization around boundary regions. Nevertheless, the parsing prediction within the interior of a part contour may still exhibit inconsistencies due to the inherent ambiguity of fine-grained semantics. In contrast, binary edge detection does not suffer from such fine-grained semantic ambiguity, leading to a typical failure case where misclassification occurs inside the part contour while the semantic edge is accurately detected. To address this problem, we develop a novel diffusion scheme that incorporates guidance from the detected semantic edge and propagates correctly classified semantics into the misclassified regions. Building upon this diffusion scheme, we present an Edge Guided Diffusion Network (EGDNet) for human parsing, which progressively refines the parsing predictions to enhance the accuracy and coherence of human parsing results. Moreover, we design a horizontal-vertical aggregation to exploit inherent correlations among body parts along both the horizontal and vertical axes, which aims at enhancing the initial parsing results. Extensive experimental evaluations on various challenging datasets demonstrate the effectiveness of the proposed EGDNet. Remarkably, our EGDNet achieves impressive performance on six benchmark datasets, including four human body parsing datasets (LIP, CIHP, ATR, and PASCAL-Person-Part) and two human face parsing datasets (CelebAMask-HQ and LaPa).
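
The idea of letting a detected edge map steer how class scores spread inside a part can be illustrated with a toy example. The sketch below is not the EGDNet architecture and is not taken from the paper: it is a plain NumPy neighbor-averaging loop in which score exchange between two adjacent pixels is switched off whenever either pixel lies on an edge. The function name `edge_guided_diffusion`, the `steps` parameter, the 4-neighbor averaging rule, and the toy 5x9 layout are all assumptions made for illustration.

```python
import numpy as np

def edge_guided_diffusion(probs, edge, steps=10):
    """Toy edge-stopped smoothing of per-pixel class scores (not EGDNet itself).

    probs: float array (C, H, W), class scores per pixel.
    edge:  float array (H, W) in [0, 1], where 1 marks a detected boundary.
    """
    _, h, w = probs.shape
    perm = 1.0 - edge                      # permeability: 0 on edges, 1 elsewhere
    perm_pad = np.pad(perm, 1)             # zero padding: nothing flows in from outside
    out = probs.astype(float)
    for _ in range(steps):
        out_pad = np.pad(out, ((0, 0), (1, 1), (1, 1)))
        acc = out.copy()                   # each pixel keeps its own score with weight 1
        norm = np.ones((h, w))
        for dy, dx in ((0, 1), (2, 1), (1, 0), (1, 2)):   # up, down, left, right
            wgt = perm * perm_pad[dy:dy + h, dx:dx + w]   # zero if either pixel is an edge
            acc += wgt * out_pad[:, dy:dy + h, dx:dx + w]
            norm += wgt
        out = acc / norm
    return out

# Hypothetical toy input: two parts separated by an edge column,
# with one wrongly labeled pixel inside the left part.
labels = np.zeros((5, 9), dtype=int)
labels[:, 5:] = 1
labels[2, 1] = 1
edge = np.zeros((5, 9))
edge[:, 4] = 1.0
probs = np.stack([labels == 0, labels == 1]).astype(float)
refined = edge_guided_diffusion(probs, edge).argmax(axis=0)
```

In this toy setup the isolated wrong pixel is outvoted by its correctly labeled neighbors, while the edge column blocks any exchange between the two parts, so the right part keeps its labels.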

Original language	English
Pages (from-to)	2530-2543
Number of pages	14
Journal	IEEE Transactions on Image Processing
Volume	33
DOIs	https://doi.org/10.1109/TIP.2024.3379931
State	Published - 2024

Keywords

edge detection
face parsing
Human parsing
semantic segmentation

Cite this

@article{dcaf24f20ec24e4eab4c3aafa64c7792,

title = "Toward Accurate Human Parsing Through Edge Guided Diffusion",

abstract = "Existing human parsing frameworks commonly employ joint learning of semantic edge detection and human parsing to facilitate localization around boundary regions. Nevertheless, the parsing prediction within the interior of a part contour may still exhibit inconsistencies due to the inherent ambiguity of fine-grained semantics. In contrast, binary edge detection does not suffer from such fine-grained semantic ambiguity, leading to a typical failure case where misclassification occurs inside the part contour while the semantic edge is accurately detected. To address this problem, we develop a novel diffusion scheme that incorporates guidance from the detected semantic edge and propagates correctly classified semantics into the misclassified regions. Building upon this diffusion scheme, we present an Edge Guided Diffusion Network (EGDNet) for human parsing, which progressively refines the parsing predictions to enhance the accuracy and coherence of human parsing results. Moreover, we design a horizontal-vertical aggregation to exploit inherent correlations among body parts along both the horizontal and vertical axes, which aims at enhancing the initial parsing results. Extensive experimental evaluations on various challenging datasets demonstrate the effectiveness of the proposed EGDNet. Remarkably, our EGDNet achieves impressive performance on six benchmark datasets, including four human body parsing datasets (LIP, CIHP, ATR, and PASCAL-Person-Part) and two human face parsing datasets (CelebAMask-HQ and LaPa).",

keywords = "edge detection, face parsing, Human parsing, semantic segmentation",

author = "Ting Liu and Hongkun Zhu and Yunchao Wei and Shikui Wei and Yao Zhao and Yanning Zhang",

note = "Publisher Copyright: {\textcopyright} 1992-2012 IEEE.",

year = "2024",

doi = "10.1109/TIP.2024.3379931",

language = "English",

volume = "33",

pages = "2530--2543",

journal = "IEEE Transactions on Image Processing",

issn = "1057-7149",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Toward Accurate Human Parsing Through Edge Guided Diffusion

AU - Liu, Ting

AU - Zhu, Hongkun

AU - Wei, Yunchao

AU - Wei, Shikui

AU - Zhao, Yao

AU - Zhang, Yanning

PY - 2024

Y1 - 2024

AB - Existing human parsing frameworks commonly employ joint learning of semantic edge detection and human parsing to facilitate localization around boundary regions. Nevertheless, the parsing prediction within the interior of a part contour may still exhibit inconsistencies due to the inherent ambiguity of fine-grained semantics. In contrast, binary edge detection does not suffer from such fine-grained semantic ambiguity, leading to a typical failure case where misclassification occurs inside the part contour while the semantic edge is accurately detected. To address this problem, we develop a novel diffusion scheme that incorporates guidance from the detected semantic edge and propagates correctly classified semantics into the misclassified regions. Building upon this diffusion scheme, we present an Edge Guided Diffusion Network (EGDNet) for human parsing, which progressively refines the parsing predictions to enhance the accuracy and coherence of human parsing results. Moreover, we design a horizontal-vertical aggregation to exploit inherent correlations among body parts along both the horizontal and vertical axes, which aims at enhancing the initial parsing results. Extensive experimental evaluations on various challenging datasets demonstrate the effectiveness of the proposed EGDNet. Remarkably, our EGDNet achieves impressive performance on six benchmark datasets, including four human body parsing datasets (LIP, CIHP, ATR, and PASCAL-Person-Part) and two human face parsing datasets (CelebAMask-HQ and LaPa).

KW - edge detection

KW - face parsing

KW - Human parsing

KW - semantic segmentation

UR - http://www.scopus.com/inward/record.url?scp=85189330358&partnerID=8YFLogxK

U2 - 10.1109/TIP.2024.3379931

DO - 10.1109/TIP.2024.3379931

M3 - Article

C2 - 38530730

AN - SCOPUS:85189330358

SN - 1057-7149

VL - 33

SP - 2530

EP - 2543

JO - IEEE Transactions on Image Processing

JF - IEEE Transactions on Image Processing

ER -
