Findings of Factify 2: Multimodal Fake News Detection
With social media usage growing exponentially in the past few years, fake
news has also become extremely prevalent. The detrimental impact of fake news
emphasizes the need for research focused on automating the detection of false
information and verifying its accuracy. In this work, we present the outcome of
the Factify 2 shared task, which provides a multi-modal fact verification and
satire news dataset, as part of the DeFactify 2 workshop at AAAI'23. The data
calls for a comparison-based approach to the task by pairing social media
claims with supporting documents, with both text and image, divided into 5
classes based on multi-modal relations. In the second iteration of this task we
had over 60 participants and 9 final test-set submissions. The best
performances came from the use of DeBERTa for text and Swinv2 and CLIP for
image. The highest F1 score averaged over all five classes was 81.82%.
Comment: Defactify2 @AAAI 202
Overview of Memotion 3: Sentiment and Emotion Analysis of Codemixed Hinglish Memes
Analyzing memes on the internet has emerged as a crucial endeavor due to the
impact this multi-modal form of content wields in shaping online discourse.
Memes have become a powerful tool for expressing emotions and sentiments,
possibly even spreading hate and misinformation, through humor and sarcasm. In
this paper, we present the overview of the Memotion 3 shared task, as part of
the DeFactify 2 workshop at AAAI-23. The task released an annotated dataset of
Hindi-English code-mixed memes based on their Sentiment (Task A), Emotion (Task
B), and Emotion intensity (Task C). Each of these is defined as an individual
task and the participants are ranked separately for each task. Over 50 teams
registered for the shared task and 5 made final submissions to the test set of
the Memotion 3 dataset. CLIP, BERT modifications, ViT etc. were the most
popular models among the participants along with approaches such as
Student-Teacher model, Fusion, and Ensembling. The best final F1 score for Task
A is 34.41, for Task B is 79.77, and for Task C is 59.82.
Comment: Defactify2 @AAAI 202
Factify 2: A Multimodal Fake News and Satire News Dataset
The internet gives the world an open platform to express their views and
share their stories. While this is very valuable, it makes fake news one of our
society's most pressing problems. Manual fact checking is time-consuming,
which makes it challenging to disprove misleading assertions before
they cause significant harm. This is driving interest in automatic fact or
claim verification. Some existing datasets aim to support the development of
automated fact-checking techniques; however, most of them are text-based.
Multi-modal fact verification has received relatively scant attention. In this
paper, we provide a multi-modal fact-checking dataset called FACTIFY 2,
improving Factify 1 by using new data sources and adding satire articles.
Factify 2 has 50,000 new data instances. Similar to FACTIFY 1.0, we have three
broad categories - support, no-evidence, and refute, with sub-categories based
on the entailment of visual and textual data. We also provide a BERT and Vision
Transformer based baseline, which achieves 65% F1 score on the test set. The
baseline code and the dataset will be made available at
https://github.com/surya1701/Factify-2.0.
Comment: Defactify@AAAI202
Semantic Interpretation of Social Network Communities
A community in a social network is considered to be a group of nodes densely connected internally and sparsely connected externally. Although previous work has intensely studied network topology within a community, its semantic interpretation is hardly understood. In this paper, we attempt to understand whether individuals in a community possess similar Personalities, Values and Ethical background. Finally, we show that Personality and Values models could be used as features to discover more accurate community structure compared to the one obtained from only network information.
Simplifying Distributed Neural Network Training on Massive Graphs: Randomized Partitions Improve Model Aggregation
Distributed training of GNNs enables learning on massive graphs (e.g., social
and e-commerce networks) that exceed the storage and computational capacity of
a single machine. To reach performance comparable to centralized training,
distributed frameworks focus on maximally recovering cross-instance node
dependencies with either communication across instances or periodic fallback to
centralized training, which create overhead and limit the framework
scalability. In this work, we present a simplified framework for distributed
GNN training that does not rely on the aforementioned costly operations, and
has improved scalability, convergence speed and performance over the
state-of-the-art approaches. Specifically, our framework (1) assembles
independent trainers, each of which asynchronously learns a local model on
locally-available parts of the training graph, and (2) only conducts periodic
(time-based) model aggregation to synchronize the local models. Backed by our
theoretical analysis, instead of maximizing the recovery of cross-instance node
dependencies -- which has been considered the key behind closing the
performance gap between model aggregation and centralized training -- , our
framework leverages randomized assignment of nodes or super-nodes (i.e.,
collections of original nodes) to partition the training graph such that it
improves data uniformity and minimizes the discrepancy of gradient and loss
function across instances. In our experiments on social and e-commerce networks
with up to 1.3 billion edges, our proposed RandomTMA and SuperTMA approaches --
despite using less training data -- achieve state-of-the-art performance and
2.31x speedup compared to the fastest baseline, and show better robustness to
trainer failures.
Comment: 14 pages, 3 figure
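The two steps named in the abstract above, randomized assignment of nodes to trainers and periodic model aggregation, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, plain parameter averaging is assumed as the aggregation rule (the abstract does not specify one), and dropping cross-partition edges is an assumed reading of "using less training data".

```python
import random

def random_partition(nodes, num_trainers, seed=0):
    """Assign each node to one trainer uniformly at random (hypothetical helper)."""
    rng = random.Random(seed)
    parts = [[] for _ in range(num_trainers)]
    for node in nodes:
        parts[rng.randrange(num_trainers)].append(node)
    return parts

def induced_edges(edges, part):
    """Keep only edges whose endpoints both live on the same trainer.

    Assumption: cross-partition edges are simply dropped, so each trainer
    sees only its locally available subgraph.
    """
    local = set(part)
    return [(u, v) for u, v in edges if u in local and v in local]

def aggregate(local_models):
    """Average parameters elementwise across trainers.

    Assumption: each local model is a flat list of floats, and aggregation
    is a plain mean.
    """
    n = len(local_models)
    return [sum(ws) / n for ws in zip(*local_models)]

if __name__ == "__main__":
    nodes = list(range(10))
    edges = [(i, (i + 1) % 10) for i in range(10)]
    parts = random_partition(nodes, num_trainers=3, seed=42)
    for i, part in enumerate(parts):
        print(i, part, induced_edges(edges, part))
    print(aggregate([[1.0, 2.0], [3.0, 4.0]]))
```

In a real system each trainer would run its own training loop on its subgraph, and `aggregate` would be called on a timer rather than once.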
ANALOGICAL - A New Benchmark for Analogy of Long Text for Large Language Models
Over the past decade, analogies, in the form of word-level analogies, have played a significant role as an intrinsic measure of evaluating the quality of word embedding methods such as word2vec. Modern large language models (LLMs), however, are primarily evaluated on extrinsic measures based on benchmarks such as GLUE and SuperGLUE, and there are only a few investigations on whether LLMs can draw analogies between long texts. In this paper, we present ANALOGICAL, a new benchmark to intrinsically evaluate LLMs across a taxonomy of analogies of long text with six levels of complexity – (i) word, (ii) word vs. sentence, (iii) syntactic, (iv) negation, (v) entailment, and (vi) metaphor. Using thirteen datasets and three different distance measures, we evaluate the abilities of eight LLMs in identifying analogical pairs in the semantic vector space (e.g., “I can speak two languages” should be closer to “I am bilingual” while “I like chocolate” and “I do not like chocolate” should be orthogonal). Our evaluation finds that it is increasingly challenging for LLMs to identify analogies when going up the analogy taxonomy.
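The notions of "closer" and "orthogonal" in the semantic vector space can be made concrete with a small sketch. Cosine similarity is an assumption here, since the abstract does not name its three distance measures, and the vectors are toy values rather than embeddings produced by any LLM.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    """One minus cosine similarity: near 0 for parallel vectors, 1 for orthogonal ones."""
    return 1.0 - cosine_similarity(a, b)

if __name__ == "__main__":
    # Toy stand-ins for sentence embeddings (hypothetical values).
    parallel_a, parallel_b = [1.0, 2.0], [2.0, 4.0]
    orthogonal_a, orthogonal_b = [1.0, 0.0], [0.0, 1.0]
    print(cosine_distance(parallel_a, parallel_b))
    print(cosine_distance(orthogonal_a, orthogonal_b))
```

Under this measure, an analogical pair would be expected to score a small distance, while a pair that should be orthogonal would score a distance close to 1.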
ANALOGICAL -- A New Benchmark for Analogy of Long Text for Large Language Models
Over the past decade, analogies, in the form of word-level analogies, have
played a significant role as an intrinsic measure of evaluating the quality of
word embedding methods such as word2vec. Modern large language models (LLMs),
however, are primarily evaluated on extrinsic measures based on benchmarks such
as GLUE and SuperGLUE, and there are only a few investigations on whether LLMs
can draw analogies between long texts. In this paper, we present ANALOGICAL, a
new benchmark to intrinsically evaluate LLMs across a taxonomy of analogies of
long text with six levels of complexity -- (i) word, (ii) word vs. sentence,
(iii) syntactic, (iv) negation, (v) entailment, and (vi) metaphor. Using
thirteen datasets and three different distance measures, we evaluate the
abilities of eight LLMs in identifying analogical pairs in the semantic vector
space. Our evaluation finds that it is increasingly challenging for LLMs to
identify analogies when going up the analogy taxonomy.
Comment: Accepted as a long paper at Findings of ACL 202