Exploring Image Virality in Google Plus
Reactions to posts in an online social network show different dynamics
depending on several textual features of the corresponding content. Do similar
dynamics exist when images are posted? Exploiting a novel dataset of posts,
gathered from the most popular Google+ users, we try to give an answer to such
a question. We describe several virality phenomena that emerge when taking into
account visual characteristics of images (such as orientation, mean saturation,
etc.). We also provide hypotheses and potential explanations for the dynamics
behind them, and include cases for which common-sense expectations do not hold
true in our experiments.
Comment: 8 pages, 8 figures. IEEE/ASE SocialCom 201
The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models
A substantial gap persists in understanding the reasons behind the
exceptional performance of the Transformer architecture in NLP. A particularly
unexplored area involves the mechanistic description of how the distribution of
parameters evolves over time during training. In this work we suggest that
looking at the time evolution of the statistical distribution of model
parameters, and specifically at bifurcation effects, can help in understanding
model quality, potentially reducing training costs and evaluation efforts and
empirically showing the reasons behind the effectiveness of weight
sparsification.
Comment: 15 pages
Deep Feelings: A Massive Cross-Lingual Study on the Relation between Emotions and Virality
This article provides a comprehensive investigation of the relations between the virality of news articles and the emotions they are found to evoke. Virality, in our view, is a phenomenon with many facets, i.e., this generic term comprises several different effects of persuasive communication. By exploiting a high-coverage and bilingual corpus of documents containing metrics of their spread on social networks as well as a massive affective annotation provided by readers, we present a thorough analysis of the interplay between evoked emotions and viral facets. We highlight and discuss our findings in light of a cross-lingual approach: while we discover differences in evoked emotions and corresponding viral effects, we provide preliminary evidence of a generalized explanatory model rooted in the deep structure of emotions: the Valence-Arousal-Dominance (VAD) circumplex. We find that viral facets appear to be consistently affected by particular VAD configurations, and these configurations indicate a clear connection with distinct phenomena underlying persuasive communication.
Toward Stance-based Personas for Opinionated Dialogues
In the context of chit-chat dialogues it has been shown that endowing systems
with a persona profile is important to produce more coherent and meaningful
conversations. Still, the representation of such personas has thus far been
limited to a fact-based representation (e.g. "I have two cats."). We argue that
these representations remain superficial w.r.t. the complexity of human
personality. In this work, we propose to make a step forward and investigate
stance-based persona, trying to grasp more profound characteristics, such as
opinions, values, and beliefs to drive language generation. To this end, we
introduce a novel dataset that allows exploring different stance-based persona
representations and their impact on claim generation, showing that they are
able to grasp abstract and profound aspects of the author persona.
Comment: Accepted at Findings of EMNLP 202
SALSA: A Novel Dataset for Multimodal Group Behavior Analysis
Studying free-standing conversational groups (FCGs) in unstructured social
settings (e.g., a cocktail party) is gratifying due to the wealth of information
available at the group (mining social networks) and individual (recognizing
native behavioral and personality traits) levels. However, analyzing social
scenes involving FCGs is also highly challenging due to the difficulty in
extracting behavioral cues such as target locations, their speaking activity
and head/body pose due to crowdedness and presence of extreme occlusions. To
this end, we propose SALSA, a novel dataset facilitating multimodal and
Synergetic sociAL Scene Analysis, and make two main contributions to research
on automated social interaction analysis: (1) SALSA records social interactions
among 18 participants in a natural, indoor environment for over 60 minutes,
under the poster presentation and cocktail party contexts presenting
difficulties in the form of low-resolution images, lighting variations,
numerous occlusions, reverberations and interfering sound sources; (2) To
alleviate these problems we facilitate multimodal analysis by recording the
social interplay using four static surveillance cameras and sociometric badges
worn by each participant, comprising microphone, accelerometer, Bluetooth
and infrared sensors. In addition to raw data, we also provide annotations
concerning individuals' personality as well as their position, head, body
orientation and F-formation information over the entire event duration. Through
extensive experiments with state-of-the-art approaches, we show (a) the
limitations of current methods and (b) how the recorded multiple cues
synergetically aid automatic analysis of social interactions. SALSA is
available at http://tev.fbk.eu/salsa.
Comment: 14 pages, 11 figures
Glitter or Gold? Deriving Structured Insights from Sustainability Reports via Large Language Models
Over the last decade, several regulatory bodies have started requiring the
disclosure of non-financial information from publicly listed companies, in
light of the investors' increasing attention to Environmental, Social, and
Governance (ESG) issues. Publicly released information on sustainability
practices is often disclosed in diverse, unstructured, and multi-modal
documentation. This poses a challenge in efficiently gathering and aligning the
data into a unified framework to derive insights related to Corporate Social
Responsibility (CSR). Thus, using Information Extraction (IE) methods becomes
an intuitive choice for delivering insightful and actionable data to
stakeholders. In this study, we employ Large Language Models (LLMs), In-Context
Learning, and the Retrieval-Augmented Generation (RAG) paradigm to extract
structured insights related to ESG aspects from companies' sustainability
reports. We then leverage graph-based representations to conduct statistical
analyses concerning the extracted insights. These analyses revealed that ESG
criteria cover a wide range of topics, exceeding 500, often beyond those
considered in existing categorizations, and are addressed by companies through
a variety of initiatives. Moreover, disclosure similarities emerged among
companies from the same region or sector, validating ongoing hypotheses in the
ESG literature. Lastly, by incorporating additional company attributes into our
analyses, we investigated which factors most affect companies' ESG
ratings, showing that ESG disclosure affects the obtained ratings more than
other financial or company data.
Countering Misinformation via Emotional Response Generation
The proliferation of misinformation on social media platforms (SMPs) poses a
significant danger to public health, social cohesion and ultimately democracy.
Previous research has shown how social correction can be an effective way to
curb misinformation, by engaging directly in a constructive dialogue with users
who spread -- often in good faith -- misleading messages. Although professional
fact-checkers are crucial to debunking viral claims, they usually do not engage
in conversations on social media. Therefore, significant effort has been made to
automate the use of fact-checker material in social correction; however, no
previous work has tried to integrate it with the style and pragmatics that are
commonly employed in social media communication. To fill this gap, we present
VerMouth, the first large-scale dataset comprising roughly 12 thousand
claim-response pairs (linked to debunking articles), accounting for both
SMP-style and basic emotions, two factors which have a significant role in
misinformation credibility and spreading. To collect this dataset we used a
technique based on an author-reviewer pipeline, which efficiently combines LLMs
and human annotators to obtain high-quality data. We also provide comprehensive
experiments showing how models trained on our proposed dataset have significant
improvements in terms of output quality and generalization capabilities.
Comment: Accepted to EMNLP 2023 main conference