Face De-Occlusion With Deep Cascade Guidance Learning

Ni Zhang; Nian Liu; Junwei Han; Kaiyuan Wan; Ling Shao

doi:10.1109/TMM.2022.3157036

School of Automation

Research output: Contribution to journal › Article › Peer-reviewed

10 citations (Scopus)

Abstract

Occlusion is a challenging yet common problem for facial perception. Existing works rely on deep learning models trained on synthesized data because paired real-world data are lacking. As a result, they usually perform unsatisfactorily on real-world occluded faces because of domain gaps. To alleviate this issue, we decompose the face de-occlusion task into three stages: occlusion detection, face parsing, and face reconstruction. We first perform occlusion detection and use its results as guidance for the second stage, which conducts occlusion-free face parsing. Face de-occlusion is thus first performed in the face parsing space, where it is less difficult. These two stages can be trained on both synthesized and real-world images, and can therefore produce accurate results on the latter. In the last stage, we use the domain-agnostic occlusion detection map and the face parsing map as guidance for face reconstruction, which reduces the impact of appearance information and improves model performance on real-world data. To improve the model's capacity to infer occluded facial appearance, we also propose two types of reference modules that use relevant facial parts to enhance the reconstruction of occluded regions. Consequently, our proposed model achieves promising face de-occlusion results on real-world images.
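The data flow of the three-stage cascade can be illustrated with a toy sketch. It is not the authors' implementation: each stage is a trivial hand-written stand-in for what would be a learned network, and every name here (`OCCLUDER_VALUE`, `detect_occlusion`, `parse_face`, `reconstruct`, `de_occlude`) is hypothetical. The only point it shows is that stage 2 receives the occlusion map as guidance, and stage 3 receives both the occlusion map and the parsing map.

```python
from typing import Dict, List, Optional

Grid = List[List[float]]
Labels = List[List[int]]

# Assumption for this toy only: occluded pixels carry a sentinel value.
OCCLUDER_VALUE = -1.0


def detect_occlusion(image: Grid) -> Labels:
    """Stage 1 stand-in: 1 where a pixel is occluded, 0 elsewhere."""
    return [[1 if px == OCCLUDER_VALUE else 0 for px in row] for row in image]


def parse_face(image: Grid, occ: Labels) -> Labels:
    """Stage 2 stand-in: occlusion-free parsing guided by the occlusion map.

    Visible pixels get a label by thresholding; occluded pixels copy the
    label of the nearest visible pixel in the same row.
    """
    parsed: Labels = []
    for row, occ_row in zip(image, occ):
        labels: List[Optional[int]] = [
            None if o else (1 if px >= 0.5 else 0) for px, o in zip(row, occ_row)
        ]
        visible = [i for i, lab in enumerate(labels) if lab is not None]
        filled: List[int] = []
        for i, lab in enumerate(labels):
            if lab is None:
                if visible:
                    nearest = min(visible, key=lambda j: abs(j - i))
                    lab = labels[nearest]
                else:
                    lab = 0  # no visible pixel in this row: fall back to label 0
            filled.append(lab)
        parsed.append(filled)
    return parsed


def reconstruct(image: Grid, occ: Labels, parsing: Labels) -> Grid:
    """Stage 3 stand-in: reconstruction guided by both maps.

    Each occluded pixel is filled with the mean of the visible pixels that
    share its parsing label; visible pixels are left unchanged.
    """
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for row, occ_row, lab_row in zip(image, occ, parsing):
        for px, o, lab in zip(row, occ_row, lab_row):
            if not o:
                sums[lab] = sums.get(lab, 0.0) + px
                counts[lab] = counts.get(lab, 0) + 1
    out: Grid = []
    for row, occ_row, lab_row in zip(image, occ, parsing):
        out.append([
            (sums[lab] / counts[lab] if lab in counts else 0.0) if o else px
            for px, o, lab in zip(row, occ_row, lab_row)
        ])
    return out


def de_occlude(image: Grid) -> Grid:
    """Run the cascade: detection, then parsing, then reconstruction."""
    occ = detect_occlusion(image)
    parsing = parse_face(image, occ)
    return reconstruct(image, occ, parsing)


if __name__ == "__main__":
    toy = [[0.8, OCCLUDER_VALUE, 0.2], [0.6, 0.6, 0.2]]
    print(de_occlude(toy))
```

In this sketch the first two stages never output appearance values, which loosely mirrors the abstract's argument that the occlusion and parsing maps are domain-agnostic guidance for the final stage.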

Original language	English
Pages (from-to)	3217-3229
Number of pages	13
Journal	IEEE Transactions on Multimedia
Volume	25
DOI	https://doi.org/10.1109/TMM.2022.3157036
Publication status	Published - 2023

Cite this

@article{b758368766d746029cfab6aa265cd75d,

title = "Face De-Occlusion With Deep Cascade Guidance Learning",

abstract = "Occlusion is a challenging yet commonly seen problem for facial perception. Existing works resort to deep learning models and perform model training on synthesized data due to the lack of paired real-world data. As a result, they usually perform unsatisfactorily on real-world occluded faces because of domain gaps. In this paper, we decompose the face de-occlusion task into three stages, i.e., occlusion detection, face parsing, and face reconstruction, to alleviate this issue. We first perform occlusion detection and use its results as guidance for the second stage to conduct occlusion-free face parsing. As such, face de-occlusion is first performed on the face parsing space with less difficulty. We can train these two stages on both synthesized and real-world images, and hence can obtain accurate results for the latter. In the last stage, we use the domain-agnostic occlusion detection map and the face parsing map as the guidance to conduct face reconstruction, and thus can reduce the impact of appearance information and improve the model performance on real-world data. Aiming at improving the model capacity of inferring occluded facial appearance, we also propose two types of reference modules to use relevant facial parts to enhance the reconstruction of occluded regions. Consequently, our proposed model achieves promising face de-occlusion results on real-world images.",

keywords = "Face de-occlusion, face inpainting, face parsing, GAN",

author = "Ni Zhang and Nian Liu and Junwei Han and Kaiyuan Wan and Ling Shao",

note = "Publisher Copyright: {\textcopyright} 1999-2012 IEEE.",

year = "2023",

doi = "10.1109/TMM.2022.3157036",

language = "English",

volume = "25",

pages = "3217--3229",

journal = "IEEE Transactions on Multimedia",

issn = "1520-9210",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Face De-Occlusion With Deep Cascade Guidance Learning

AU - Zhang, Ni

AU - Liu, Nian

AU - Han, Junwei

AU - Wan, Kaiyuan

AU - Shao, Ling

PY - 2023

Y1 - 2023

N2 - Occlusion is a challenging yet commonly seen problem for facial perception. Existing works resort to deep learning models and perform model training on synthesized data due to the lack of paired real-world data. As a result, they usually perform unsatisfactorily on real-world occluded faces because of domain gaps. In this paper, we decompose the face de-occlusion task into three stages, i.e., occlusion detection, face parsing, and face reconstruction, to alleviate this issue. We first perform occlusion detection and use its results as guidance for the second stage to conduct occlusion-free face parsing. As such, face de-occlusion is first performed on the face parsing space with less difficulty. We can train these two stages on both synthesized and real-world images, and hence can obtain accurate results for the latter. In the last stage, we use the domain-agnostic occlusion detection map and the face parsing map as the guidance to conduct face reconstruction, and thus can reduce the impact of appearance information and improve the model performance on real-world data. Aiming at improving the model capacity of inferring occluded facial appearance, we also propose two types of reference modules to use relevant facial parts to enhance the reconstruction of occluded regions. Consequently, our proposed model achieves promising face de-occlusion results on real-world images.

AB - Occlusion is a challenging yet commonly seen problem for facial perception. Existing works resort to deep learning models and perform model training on synthesized data due to the lack of paired real-world data. As a result, they usually perform unsatisfactorily on real-world occluded faces because of domain gaps. In this paper, we decompose the face de-occlusion task into three stages, i.e., occlusion detection, face parsing, and face reconstruction, to alleviate this issue. We first perform occlusion detection and use its results as guidance for the second stage to conduct occlusion-free face parsing. As such, face de-occlusion is first performed on the face parsing space with less difficulty. We can train these two stages on both synthesized and real-world images, and hence can obtain accurate results for the latter. In the last stage, we use the domain-agnostic occlusion detection map and the face parsing map as the guidance to conduct face reconstruction, and thus can reduce the impact of appearance information and improve the model performance on real-world data. Aiming at improving the model capacity of inferring occluded facial appearance, we also propose two types of reference modules to use relevant facial parts to enhance the reconstruction of occluded regions. Consequently, our proposed model achieves promising face de-occlusion results on real-world images.

KW - Face de-occlusion

KW - face inpainting

KW - face parsing

KW - GAN

UR - http://www.scopus.com/inward/record.url?scp=85126331605&partnerID=8YFLogxK

U2 - 10.1109/TMM.2022.3157036

DO - 10.1109/TMM.2022.3157036

M3 - Article

AN - SCOPUS:85126331605

SN - 1520-9210

VL - 25

SP - 3217

EP - 3229

JO - IEEE Transactions on Multimedia

JF - IEEE Transactions on Multimedia

ER -
