3 research outputs found
Knowledge-Enhanced Hierarchical Information Correlation Learning for Multi-Modal Rumor Detection
The explosive growth of rumors with text and images on social media platforms
has drawn great attention. Existing studies have made significant contributions
to cross-modal information interaction and fusion, but they fail to fully
explore hierarchical and complex semantic correlation across different modality
content, severely limiting their performance on detecting multi-modal rumors. In
this work, we propose a novel knowledge-enhanced hierarchical information
correlation learning approach (KhiCL) for multi-modal rumor detection by
jointly modeling the basic semantic correlation and high-order
knowledge-enhanced entity correlation. Specifically, KhiCL exploits a cross-modal
joint dictionary to transfer the heterogeneous unimodal features into the
common feature space and captures the basic cross-modal semantic consistency
and inconsistency by a cross-modal fusion layer. Moreover, considering that the
description of multi-modal content is narrated around entities, KhiCL extracts
visual and textual entities from images and text, and designs a knowledge
relevance reasoning strategy to find the shortest semantically relevant path
between each pair of entities in an external knowledge graph, and absorbs all
complementary contextual knowledge of other connected entities in this path for
learning knowledge-enhanced entity representations. Furthermore, KhiCL utilizes
a signed attention mechanism to model the knowledge-enhanced entity consistency
and inconsistency of intra-modality and inter-modality entity pairs by
measuring their corresponding semantically relevant distance. Extensive experiments
have demonstrated the effectiveness of the proposed method.
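
The shortest-path step described in this abstract can be illustrated with a minimal sketch. It assumes an unweighted, undirected knowledge graph stored as adjacency lists and uses plain breadth-first search; the abstract does not say how KhiCL searches the graph or how it encodes the entities on the path, so the function name `shortest_path`, the toy graph, and the entity names are all hypothetical.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Breadth-first search over an unweighted graph.

    Returns the list of nodes from source to target, or None if unreachable.
    """
    if source == target:
        return [source]
    parents = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in parents:
                continue
            parents[neighbor] = node
            if neighbor == target:
                # Walk the parent links back to the source.
                path = [target]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical toy knowledge graph (undirected adjacency lists).
toy_graph = {
    "flood": ["river", "rain"],
    "river": ["flood", "bridge"],
    "rain": ["flood", "storm"],
    "bridge": ["river", "city"],
    "storm": ["rain"],
    "city": ["bridge"],
}

path = shortest_path(toy_graph, "flood", "city")
context_entities = path[1:-1]  # intermediate entities on the path
distance = len(path) - 1       # hop count between the two entities
```

Here `context_entities` stands in for the connected entities whose knowledge would be absorbed, and `distance` for the relevance distance between the pair; a learned model would embed these rather than use them directly.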
Open-Domain, Content-based, Multi-modal Fact-checking of Out-of-Context Images via Online Resources
Misinformation is now a major problem due to its potentially high risks to our core democratic and societal values and orders. Out-of-context misinformation is one of the easiest and most effective ways used by adversaries to spread viral false stories. In this threat, a real image is re-purposed to support other narratives by misrepresenting its context and/or elements. The internet is being used as the go-to way to verify information using different sources and modalities. Our goal is an inspectable method that automates this time-consuming and reasoning-intensive process by fact-checking the image-caption pairing using Web evidence. To integrate evidence and cues from both modalities, we introduce the concept of 'multi-modal cycle-consistency check'; starting from the image/caption, we gather textual/visual evidence, which will be compared against the other paired caption/image, respectively. Moreover, we propose a novel architecture, Consistency-Checking Network (CCN), that mimics the layered human reasoning across the same and different modalities: the caption vs. textual evidence, the image vs. visual evidence, and the image vs. caption. Our work offers the first step and benchmark for open-domain, content-based, multi-modal fact-checking, and significantly outperforms previous baselines that did not leverage external evidence.
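
The three comparisons named in this abstract can be sketched as follows. This is only an illustration, not the CCN architecture, which is a learned network: it assumes that the image, the caption, and all retrieved evidence have already been embedded into one shared vector space, and it scores each comparison with cosine similarity. The function names and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def consistency_scores(image_vec, caption_vec, textual_evidence, visual_evidence):
    """Score the three checks, taking the best-matching evidence item for each.

    textual_evidence: embeddings of text retrieved starting from the image.
    visual_evidence:  embeddings of images retrieved starting from the caption.
    """
    return {
        "caption_vs_textual_evidence": max(
            (cosine(caption_vec, e) for e in textual_evidence), default=0.0
        ),
        "image_vs_visual_evidence": max(
            (cosine(image_vec, e) for e in visual_evidence), default=0.0
        ),
        "image_vs_caption": cosine(image_vec, caption_vec),
    }


# Made-up 2-D embeddings: the caption disagrees with what the image retrieves.
scores = consistency_scores(
    image_vec=[1.0, 0.0],
    caption_vec=[0.0, 1.0],
    textual_evidence=[[1.0, 0.0]],
    visual_evidence=[[1.0, 0.0]],
)
```

In this toy case the caption does not match the text found via the image, which is the kind of signal an out-of-context pairing would be expected to produce.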
Bootstrapping Multi-view Representations for Fake News Detection
Previous research on multimedia fake news detection includes a series of
complex feature extraction and fusion networks to gather useful information
from the news. However, how cross-modal consistency relates to the fidelity of
news and how features from different modalities affect the decision-making are
still open questions. This paper presents a novel scheme of Bootstrapping
Multi-view Representations (BMR) for fake news detection. Given a multi-modal
news item, we extract representations respectively from the views of the text, the
image pattern and the image semantics. Improved Multi-gate Mixture-of-Expert
networks (iMMoE) are proposed for feature refinement and fusion.
Representations from each view are separately used to coarsely predict the
fidelity of the whole news, and the multimodal representations are able to
predict the cross-modal consistency. With the prediction scores, we reweigh
each view of the representations and bootstrap them for fake news detection.
Extensive experiments conducted on typical fake news detection datasets show
that the proposed BMR outperforms state-of-the-art schemes.
Comment: Authors are from Fudan University, China. Under review.
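
The gated mixing behind a mixture-of-experts layer can be sketched as below. This shows the standard softmax-gated formulation only; the abstract does not describe what the improved variant (iMMoE) changes, so nothing here should be read as that design. The expert functions, the gate weights, and the input are hypothetical toy values.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_experts(x, experts, gate_weights):
    """Mix expert outputs with a softmax gate that is linear in the input.

    experts:      list of callables, each mapping a vector to a vector.
    gate_weights: one weight row per expert, same length as x.
    Returns the mixed output and the gate values.
    """
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    gates = softmax(logits)
    outputs = [expert(x) for expert in experts]
    dim = len(outputs[0])
    mixed = [sum(g * out[d] for g, out in zip(gates, outputs)) for d in range(dim)]
    return mixed, gates


# Two toy experts; zero gate weights give every expert an equal share.
experts = [
    lambda v: [2.0 * a for a in v],
    lambda v: [-a for a in v],
]
mixed, gates = mixture_of_experts([1.0, 2.0], experts, [[0.0, 0.0], [0.0, 0.0]])
```

In a multi-gate setup, the same experts would be called once per task or view with a different `gate_weights` matrix each time, so each output draws on the shared experts in its own proportions.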