TY - JOUR
T1 - Spatiotemporal fusion personality prediction based on visual information
AU - Xu, Jia
AU - Tian, Weijian
AU - Lv, Guoyun
AU - Fan, Yangyu
N1 - Publisher Copyright: © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2023/11
Y1 - 2023/11
AB - Previous studies have demonstrated that deep learning algorithms can predict personality from two-dimensional image information, and the emergence of video opens further possibilities for personality prediction. Compared with static images, video provides more information. However, a video contains hundreds of frames, not all of which are useful, and processing them is computationally expensive. This paper applies video analysis algorithms to the task of personality prediction and proposes using an LSTM to fuse image feature information. Experiments confirm that prediction is best when 16 frames are fused. The paper also builds an end-to-end video analysis network based on 3D-ConvNet and addresses network overfitting through pre-training and data augmentation. Experiments show that using 3D-ConvNet to fuse the spatio-temporal information of videos improves the accuracy of personality prediction.
KW - Personality Prediction
KW - Spatiotemporal Fusion
KW - Visual Information
UR - http://www.scopus.com/inward/record.url?scp=85156146592&partnerID=8YFLogxK
U2 - 10.1007/s11042-023-15537-0
DO - 10.1007/s11042-023-15537-0
M3 - Article
AN - SCOPUS:85156146592
SN - 1380-7501
VL - 82
SP - 44227
EP - 44244
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 28
ER -