TY - GEN
T1 - Deception Detection Algorithm Based on Global and Local Feature Fusion with Multi-head Attention
AU - Kang, Jian
AU - Qu, Wen
AU - Cui, Shaoxing
AU - Feng, Xiaoyi
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Deception is a prevalent human behavior that significantly impacts our perception of essential facts, so developing accurate deception detection technology holds great significance. However, current research on purely visual deception detection algorithms does not leverage deep learning methods to extract detailed features such as facial Action Units (AUs) and gaze angles, and the global information within facial video sequences is often overlooked. To address these limitations, this paper introduces a novel deception detection model that combines global and local facial features through attention mechanisms. First, the model focuses on local facial features, computing AU strength and gaze angle for each frame to form a multivariate time series for every video. A Siamese Transformer model employing patching then extracts deep temporal and channel features from this multivariate time series, and the occurrence frequency of five specific AUs is additionally used as a handcrafted feature. Second, the model performs video understanding based on global facial features: frame-level features are extracted using shallow CNNs with multiple receptive fields, and a Video Transformer model with spatiotemporal separation attention is then applied to globally model the sequence of face frames. Finally, the extracted local and global facial features are concatenated and fed into a classifier to determine deception. Extensive experiments on existing datasets validate the outstanding performance of the proposed method.
AB - Deception is a prevalent human behavior that significantly impacts our perception of essential facts, so developing accurate deception detection technology holds great significance. However, current research on purely visual deception detection algorithms does not leverage deep learning methods to extract detailed features such as facial Action Units (AUs) and gaze angles, and the global information within facial video sequences is often overlooked. To address these limitations, this paper introduces a novel deception detection model that combines global and local facial features through attention mechanisms. First, the model focuses on local facial features, computing AU strength and gaze angle for each frame to form a multivariate time series for every video. A Siamese Transformer model employing patching then extracts deep temporal and channel features from this multivariate time series, and the occurrence frequency of five specific AUs is additionally used as a handcrafted feature. Second, the model performs video understanding based on global facial features: frame-level features are extracted using shallow CNNs with multiple receptive fields, and a Video Transformer model with spatiotemporal separation attention is then applied to globally model the sequence of face frames. Finally, the extracted local and global facial features are concatenated and fed into a classifier to determine deception. Extensive experiments on existing datasets validate the outstanding performance of the proposed method.
KW - Deception detection
KW - Facial AU
KW - Multivariate time series
KW - Transformer
KW - Video Understanding
UR - http://www.scopus.com/inward/record.url?scp=85199474577&partnerID=8YFLogxK
U2 - 10.1109/ICIPMC62364.2024.10586666
DO - 10.1109/ICIPMC62364.2024.10586666
M3 - Conference contribution
AN - SCOPUS:85199474577
T3 - 2024 3rd International Conference on Image Processing and Media Computing, ICIPMC 2024
SP - 162
EP - 168
BT - 2024 3rd International Conference on Image Processing and Media Computing, ICIPMC 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd International Conference on Image Processing and Media Computing, ICIPMC 2024
Y2 - 17 May 2024 through 19 May 2024
ER -