TY - JOUR
T1 - A Survey of Multimodal Fake News Detection
T2 - A Cross-Modal Interaction Perspective
AU - Li, Xianghua
AU - Qiao, Jiao
AU - Yin, Shu
AU - Wu, Lianwei
AU - Gao, Chao
AU - Wang, Zhen
AU - Li, Xuelong
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2025
Y1 - 2025
AB - The growth of social media platforms has made it easier for fake news to spread, which poses a significant threat to authoritative news outlets, politics, and public health. Manual verification of the massive amount of online information has proven to be a daunting task, which has led to the growing interest in automatic fake news detection. Some methods that rely on news text, images, external knowledge, social contexts, or propagation graphs have demonstrated good performance. In contrast to earlier studies that focused solely on the unimodal news textual information, recent works have integrated multimodal features from various granularities, such as words, visual semantic regions, and multimodal entities, to more effectively leverage news content and align with human reading habits. However, a comprehensive review of Multimodal Fake News Detection (MFND) is still lacking, prompting our aim to complement this topic. Specifically, we present a systematic taxonomy from the perspective of cross-modal interactions. We categorize existing methods into the data-based, entity-based, and knowledge-based approaches. Connections between various works are detailed when outlining representative papers. Additionally, we introduce prevalent multimodal learning methods, present accessible MFND datasets and evaluation metrics, and analyze current research results. Finally, the promising future research directions are discussed.
KW - Social networks
KW - multimodal fake news detection
KW - multimodal rumor detection
UR - http://www.scopus.com/inward/record.url?scp=105002688031&partnerID=8YFLogxK
U2 - 10.1109/TETCI.2025.3543389
DO - 10.1109/TETCI.2025.3543389
M3 - Article
AN - SCOPUS:105002688031
SN - 2471-285X
JO - IEEE Transactions on Emerging Topics in Computational Intelligence
JF - IEEE Transactions on Emerging Topics in Computational Intelligence
ER -