Abstract
The widespread presence of multimodal fake news on social media platforms has severely impacted public order, making the automatic detection and filtering of such content a pressing issue. Although existing studies have attempted to integrate multimodal data for this task, they often struggle to model cross-modal correlations effectively. Most approaches focus on the global features of each modality and compute scalar similarities, which limits the cross-modal information they can learn from each sample. To address this challenge, this paper introduces a novel cross-modal content correlation network. The method takes salient objects cropped from images and nouns extracted from the text as the multimodal content, and uses CLIP to extract generalizable features for similarity measurement, thereby enhancing cross-modal interaction. By applying convolution to the similarity matrix between nouns and image crops, the model captures learnable patterns of cross-modal content correlation that facilitate news classification, without relying on predefined scalar similarities or requiring supplementary information or auxiliary tasks. Experiments on two real-world datasets show that our method outperforms previous approaches, achieving gains of 3.1% and 1.9% in overall accuracy on Weibo and Twitter, respectively. The source code is available at https://github.com/cgao-comp/C3N.
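The core idea of the abstract, learning correlation patterns by convolving a noun-crop similarity matrix rather than reducing it to one scalar, can be illustrated with a minimal PyTorch sketch. This is an assumption-laden illustration, not the authors' released code (see the repository linked above): the class name `SimilarityConvClassifier`, the layer sizes, and the embedding dimension (512, as in CLIP ViT-B/32) are all hypothetical choices.

```python
# Hypothetical sketch of convolving a noun/image-crop similarity matrix,
# assuming precomputed CLIP embeddings. Not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimilarityConvClassifier(nn.Module):
    """Classifies news from a noun-crop cosine-similarity matrix."""
    def __init__(self, num_classes=2):
        super().__init__()
        # Convolutions over the similarity matrix learn 2-D correlation
        # patterns instead of collapsing it into a single scalar score.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),  # tolerate varying matrix sizes
        )
        self.fc = nn.Linear(16 * 4 * 4, num_classes)

    def forward(self, noun_emb, crop_emb):
        # noun_emb: (B, N, d) CLIP text features for extracted nouns
        # crop_emb: (B, M, d) CLIP image features for salient-object crops
        noun_emb = F.normalize(noun_emb, dim=-1)
        crop_emb = F.normalize(crop_emb, dim=-1)
        sim = torch.bmm(noun_emb, crop_emb.transpose(1, 2))  # (B, N, M)
        feat = self.conv(sim.unsqueeze(1))  # add a channel dimension
        return self.fc(feat.flatten(1))     # class logits

# Usage with random stand-ins for CLIP features (d = 512 for ViT-B/32):
model = SimilarityConvClassifier()
nouns = torch.randn(2, 16, 512)
crops = torch.randn(2, 16, 512)
logits = model(nouns, crops)  # shape (2, 2)
```

The adaptive pooling stage is one plausible way to handle news items with different numbers of nouns and crops; in practice the counts could also be fixed by padding or truncation.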
| Field | Value |
|---|---|
| Original language | English |
| Article number | 104120 |
| Journal | Information Processing and Management |
| Volume | 62 |
| Issue number | 5 |
| DOIs | |
| State | Published - Sep 2025 |
Keywords
- Fake news detection
- Multimodal learning
- Neural network
- Social network