Micro-expression spotting with multi-scale local transformer in long videos

Xupeng Guo, Xiaobiao Zhang, Lei Li, Zhaoqiang Xia

Research output: Contribution to journal › Article › peer-review

19 citations (Scopus)

Abstract

Micro-expression analysis using computer vision techniques has attracted much attention because it can reveal human emotions automatically. Among the analysis tasks, temporal spotting, which locates expression-aware frames in long video sequences, is the most challenging. Compared with the well-studied recognition task, spotting needs more research to improve performance and benefit subsequent tasks. In this paper, we therefore propose a convolutional transformer-based deep model for micro-expression spotting in long video sequences. A 3D convolutional subnetwork is first employed to extract visual features from the frames in a fixed-size sliding window over the original video sequence. A multi-scale local transformer module then operates on these visual features to model the correlation between frames within a local window. By leveraging this correlation information, the description of facial movement becomes more representative for micro-expressions of various durations. Finally, a multi-head classifier and a corresponding estimator are combined to predict the temporal position of micro-expressions. The proposed method is evaluated on two publicly available datasets, CAS(ME)² and SAMM-LV, and achieves F1-scores of 0.2770 on SAMM-LV and 0.1373 on CAS(ME)². The code is publicly available on GitHub (https://github.com/xiazhaoqiang/MULT-MicroExpressionSpot).
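
The abstract names two windowing ideas: a fixed-size sliding window over the video, and attention restricted to a local window at several scales. The sketch below illustrates both in plain Python. It is not the authors' implementation (see the linked repository for that). The function names, window size, stride, and radii are hypothetical, since the abstract does not state them, and the boolean-mask formulation is only one possible way to express local attention.

```python
from typing import Dict, List, Sequence, Tuple


def sliding_windows(num_frames: int, window: int, stride: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame-index pairs (end exclusive) covering a video.

    Hypothetical helper: window and stride are illustrative parameters,
    not values taken from the paper.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if num_frames <= window:
        return [(0, num_frames)]
    starts = list(range(0, num_frames - window + 1, stride))
    # Add a final window flush with the end so trailing frames are not dropped.
    if starts[-1] + window < num_frames:
        starts.append(num_frames - window)
    return [(s, s + window) for s in starts]


def local_attention_mask(length: int, radius: int) -> List[List[bool]]:
    """mask[i][j] is True when position j lies within `radius` steps of i."""
    return [[abs(i - j) <= radius for j in range(length)] for i in range(length)]


def multi_scale_masks(length: int, radii: Sequence[int]) -> Dict[int, List[List[bool]]]:
    """Build one local mask per scale (assumed: one radius per scale)."""
    return {r: local_attention_mask(length, r) for r in radii}


if __name__ == "__main__":
    print(sliding_windows(11, window=4, stride=3))
    for radius, mask in multi_scale_masks(5, [1, 2]).items():
        print(radius, mask[0])
```

In a model of this kind, each window would be passed to the feature extractor, and each mask would limit which positions in the feature sequence may attend to one another; smaller radii favor short movements and larger radii longer ones.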

Original language: English
Pages (from-to): 146-152
Number of pages: 7
Journal: Pattern Recognition Letters
Volume: 168
Publication status: Published - Apr 2023
