TY - JOUR
T1 - SAST: a suppressing ambiguity self-training framework for facial expression recognition
T2 - Multimedia Tools and Applications
AU - Guo, Zhe
AU - Wei, Bingxin
AU - Liu, Xuewen
AU - Zhang, Zhibo
AU - Liu, Shiya
AU - Fan, Yangyu
N1 - Publisher Copyright:
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023.
PY - 2024/5
Y1 - 2024/5
N2 - Facial expression recognition (FER) suffers from insufficient label information: human expressions are complex and diverse, and many expressions are ambiguous. Training with low-quality or scarce labels aggravates the ambiguity of model predictions and reduces FER accuracy. How to improve the robustness of FER to ambiguous data under insufficient information remains challenging. To this end, we propose the Suppressing Ambiguity Self-Training (SAST) framework, the first attempt to simultaneously address insufficient information in both label quality and label quantity. Specifically, we design an Ambiguous Relative Label Usage (ARLU) strategy that mixes hard labels and soft labels to alleviate the information loss caused by hard labels. We also enhance the model's robustness to ambiguous data by means of Self-Training Resampling (STR). We further use facial landmarks and a Patch Branch (PB) to strengthen the ability to suppress ambiguity. Experiments on the RAF-DB, FERPlus, SFEW, and AffectNet datasets show that our SAST outperforms 6 semi-supervised methods with fewer annotations and achieves accuracy competitive with state-of-the-art (SOTA) FER methods. Our code is available at https://github.com/Liuxww/SAST.
AB - Facial expression recognition (FER) suffers from insufficient label information: human expressions are complex and diverse, and many expressions are ambiguous. Training with low-quality or scarce labels aggravates the ambiguity of model predictions and reduces FER accuracy. How to improve the robustness of FER to ambiguous data under insufficient information remains challenging. To this end, we propose the Suppressing Ambiguity Self-Training (SAST) framework, the first attempt to simultaneously address insufficient information in both label quality and label quantity. Specifically, we design an Ambiguous Relative Label Usage (ARLU) strategy that mixes hard labels and soft labels to alleviate the information loss caused by hard labels. We also enhance the model's robustness to ambiguous data by means of Self-Training Resampling (STR). We further use facial landmarks and a Patch Branch (PB) to strengthen the ability to suppress ambiguity. Experiments on the RAF-DB, FERPlus, SFEW, and AffectNet datasets show that our SAST outperforms 6 semi-supervised methods with fewer annotations and achieves accuracy competitive with state-of-the-art (SOTA) FER methods. Our code is available at https://github.com/Liuxww/SAST.
KW - Facial expression recognition
KW - Insufficient information
KW - Self-training
KW - Suppressing ambiguity
UR - http://www.scopus.com/inward/record.url?scp=85178909235&partnerID=8YFLogxK
U2 - 10.1007/s11042-023-17749-w
DO - 10.1007/s11042-023-17749-w
M3 - Article
AN - SCOPUS:85178909235
SN - 1380-7501
VL - 83
SP - 56059
EP - 56076
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 18
ER -