TY - JOUR
T1 - Toward Accurate Human Parsing Through Edge Guided Diffusion
AU - Liu, Ting
AU - Zhu, Hongkun
AU - Wei, Yunchao
AU - Wei, Shikui
AU - Zhao, Yao
AU - Zhang, Yanning
N1 - Publisher Copyright: © 1992-2012 IEEE.
PY - 2024
Y1 - 2024
AB - Existing human parsing frameworks commonly employ joint learning of semantic edge detection and human parsing to improve localization around boundary regions. Nevertheless, the parsing prediction within the interior of the part contour may still exhibit inconsistencies due to the inherent ambiguity of fine-grained semantics. In contrast, binary edge detection does not suffer from such fine-grained semantic ambiguity, leading to a typical failure case where misclassification occurs inside the part contour while the semantic edge is accurately detected. To address this failure case, we develop a novel diffusion scheme that uses guidance from the detected semantic edge to propagate correctly classified semantics into the misclassified regions. Building upon this diffusion scheme, we present an Edge Guided Diffusion Network (EGDNet) for human parsing, which progressively refines the parsing predictions to enhance the accuracy and coherence of human parsing results. Moreover, we design a horizontal-vertical aggregation to exploit inherent correlations among body parts along both the horizontal and vertical axes, with the aim of enhancing the initial parsing results. Extensive experimental evaluations on various challenging datasets demonstrate the effectiveness of the proposed EGDNet. Remarkably, our EGDNet shows impressive performance on six benchmark datasets, including four human body parsing datasets (LIP, CIHP, ATR, and PASCAL-Person-Part) and two human face parsing datasets (CelebAMask-HQ and LaPa).
KW - edge detection
KW - face parsing
KW - Human parsing
KW - semantic segmentation
UR - http://www.scopus.com/inward/record.url?scp=85189330358&partnerID=8YFLogxK
U2 - 10.1109/TIP.2024.3379931
DO - 10.1109/TIP.2024.3379931
M3 - Article
C2 - 38530730
AN - SCOPUS:85189330358
SN - 1057-7149
VL - 33
SP - 2530
EP - 2543
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
ER -