TY - GEN
T1 - Bilinear Semi-Tensor Product Attention (BSTPA) model for visual question answering
AU - Bai, Zongwen
AU - Li, Ying
AU - Zhou, Meili
AU - Li, Di
AU - Wang, Dong
AU - Polap, Dawid
AU - Wozniak, Marcin
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/7
Y1 - 2020/7
N2 - We propose a semi-tensor product attention network model for visual question answering that captures complex interactions over image features. The proposed model performs matrix multiplication of two matrices of arbitrary dimensions, which overcomes possible dimensional limitations and improves recognition flexibility. In the block-wise operation, we preserve spatial and temporal information while reducing the number of parameters with a low-rank pooling scheme. A pre-trained BERT model is tuned to recognize question features. The proposed model is evaluated on the VQA2.0 dataset. The results show that our model achieves good accuracy and is easy to reconfigure for future research.
AB - We propose a semi-tensor product attention network model for visual question answering that captures complex interactions over image features. The proposed model performs matrix multiplication of two matrices of arbitrary dimensions, which overcomes possible dimensional limitations and improves recognition flexibility. In the block-wise operation, we preserve spatial and temporal information while reducing the number of parameters with a low-rank pooling scheme. A pre-trained BERT model is tuned to recognize question features. The proposed model is evaluated on the VQA2.0 dataset. The results show that our model achieves good accuracy and is easy to reconfigure for future research.
KW - bidirectional encoder representation from transformers
KW - multimodal feature fusion
KW - semi-tensor product attention
KW - visual question answer
UR - http://www.scopus.com/inward/record.url?scp=85093875306&partnerID=8YFLogxK
U2 - 10.1109/IJCNN48605.2020.9206964
DO - 10.1109/IJCNN48605.2020.9206964
M3 - Conference contribution
AN - SCOPUS:85093875306
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2020 International Joint Conference on Neural Networks, IJCNN 2020 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 International Joint Conference on Neural Networks, IJCNN 2020
Y2 - 19 July 2020 through 24 July 2020
ER -