9 research outputs found
Exploring Spillover Effects for COVID-19 Cascade Prediction
An information outbreak occurred on social media alongside the COVID-19 pandemic, leading to an infodemic. Predicting the popularity of online content, known as cascade prediction, allows not only catching information that deserves attention in advance, but also identifying false information that will spread widely and requires a quick response to mitigate its negative impact. Among the various information diffusion patterns leveraged in previous work, the spillover effect of the information users are exposed to on their decisions to participate in diffusing certain information has not been studied. In this paper, we focus on the diffusion of information related to COVID-19 preventive measures, due to its special role in consolidating public efforts to slow the spread of the virus. Through our collected Twitter dataset, we validate the existence of the spillover effect. Building on this finding, we propose extensions to three cascade prediction methods based on Graph Neural Networks (GNNs). Experiments conducted on our dataset demonstrate that using the identified spillover effects significantly improves state-of-the-art GNN methods in predicting the popularity of not only preventive-measure messages, but also other COVID-19 messages.
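The abstract does not specify how the spillover signal enters the GNN, so the following is only a minimal sketch of the general idea: attach a "spillover" feature (exposure to related topics) to each user node and feed it through a one-hop, GCN-style aggregation with a logistic readout. All graphs, exposure counts, and weights below are invented for illustration.

```python
import math

# Hypothetical follower graph: user -> users they follow.
follows = {"a": ["b", "c"], "b": ["c"], "c": []}

# Per-user exposure counts: (target-topic messages, related-topic messages).
exposure = {"a": (3, 9), "b": (5, 1), "c": (2, 2)}

def spillover(user):
    """Share of a user's exposure coming from related topics (the spillover signal)."""
    target, related = exposure[user]
    total = target + related
    return related / total if total else 0.0

def node_feature(user):
    # Base feature: target-topic exposure count, augmented with the spillover ratio.
    return [float(exposure[user][0]), spillover(user)]

def aggregate(user):
    """Mean-pool neighbour features and concatenate with the user's own (GCN-like)."""
    neigh = [node_feature(v) for v in follows[user]] or [[0.0, 0.0]]
    mean = [sum(col) / len(neigh) for col in zip(*neigh)]
    return node_feature(user) + mean

def participation_score(user, w=(0.2, 1.5, 0.1, 0.8), b=-1.0):
    # Logistic readout over the concatenated features (weights are toy values).
    z = sum(wi * xi for wi, xi in zip(w, aggregate(user))) + b
    return 1.0 / (1.0 + math.exp(-z))

for u in follows:
    print(u, round(participation_score(u), 3))
```

A real model would learn the weights end-to-end and stack several message-passing layers; the point here is only where a spillover feature could plug in.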
The Burden of Being a Bridge: Analysing Subjective Well-Being of Twitter Users During the COVID-19 Pandemic
The outbreak of the COVID-19 pandemic triggered an infodemic on online social media, which significantly impacts public health around the world, both physically and psychologically. In this paper, we study the impact of the pandemic on the mental health of influential social media users, whose sharing behaviours significantly promote the diffusion of COVID-19 related information. Specifically, we focus on subjective well-being (SWB) and analyse whether SWB changes are related to their bridging performance in information diffusion, which measures the gains in speed and breadth of information transmission due to their sharing. We capture users' bridging performance accurately by proposing a new measurement. Benefiting from deep-learning natural language processing models, we quantify social media users' SWB from their textual posts. With data collected from Twitter over almost two years, we reveal the greater mental suffering of influential users during the COVID-19 pandemic. Through comprehensive hierarchical multiple regression analysis, we are the first to discover the strong relationship between social media users' SWB and their bridging performance.
A multi-modal, multi-platform, and multi-lingual approach to understanding online misinformation
Online social media makes access to information ever easier. Meanwhile, the truthfulness of online information is often not guaranteed. Incorrect information, often called misinformation, can take several modalities and can spread across multiple social media platforms in different languages, which can be destructive to society. However, academia and industry lack automated ways to assess the impact of misinformation on social media, preventing the adoption of productive strategies to curb its prevalence. In this dissertation, I present my research on building computational pipelines that help measure and detect misinformation on social media. My work can be divided into three parts.
The first part focuses on processing misinformation in text form. I first show how to group political news articles from both trustworthy and untrustworthy news outlets into stories. Then I present a measurement analysis on the spread of stories to characterize how mainstream and fringe Web communities influence each other.
The second part is related to analyzing image-based misinformation. It can be further divided into two parts: fauxtography and generic image misinformation. Fauxtography is a special type of image misinformation, where images are manipulated or used out of context. In this research, I present how to identify fauxtography on social media by using a fact-checking website (Snopes.com), and I also develop a computational pipeline to facilitate the measurement of these images at scale. I next focus on generic misinformation images related to COVID-19. During the pandemic, text misinformation has been studied from many angles; however, very little research has covered image misinformation during the COVID-19 pandemic. In this research, I develop a technique to cluster visually similar images together, facilitating manual annotation and making subsequent analysis possible.
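The abstract does not detail the clustering technique, so the following illustrates one common approach to grouping visually similar images: a perceptual (average) hash per image plus greedy grouping by Hamming distance. The tiny 2x2 "thumbnails" and the distance threshold are invented for the example.

```python
def average_hash(pixels):
    """Binary hash of a small grayscale thumbnail (2D list): 1 where pixel >= mean."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if v >= mean else 0 for v in flat)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def cluster(hashes, threshold=2):
    """Greedy single-link grouping: join a cluster if any member's hash is close."""
    clusters = []
    for name, h in hashes.items():
        for c in clusters:
            if any(hamming(h, hashes[m]) <= threshold for m in c):
                c.append(name)
                break
        else:
            clusters.append([name])
    return clusters

imgs = {
    "meme_v1": [[10, 200], [10, 200]],
    "meme_v2": [[12, 198], [11, 201]],  # near-duplicate of meme_v1
    "chart":   [[200, 10], [200, 10]],  # visually different image
}
hashes = {k: average_hash(v) for k, v in imgs.items()}
print(cluster(hashes))
```

In practice one would hash real thumbnails (e.g. 8x8 downsamples) so near-duplicates with crops or recompression still collide, which is what makes manual annotation of large image sets tractable.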
The last part is about the detection of misinformation in text form from a multi-language perspective. This research aims to detect textual COVID-19 related misinformation and the stances Twitter users take towards such misinformation, in both English and Chinese. To achieve this goal, I experiment with several natural language processing (NLP) models to investigate their performance on misinformation detection and stance detection in both monolingual and multi-lingual settings. The results show that two models, COVID-Tweet-BERT v2 and BERTweet, are generally effective at detecting misinformation and stance in both settings. These two models are promising candidates for misinformation moderation on social media platforms, which heavily depends on identifying misinformation and the stance of the author towards it.
Overall, the results of this dissertation shed light on the understanding of online misinformation, and my proposed computational tools are applicable to the moderation of social media, potentially contributing to a more wholesome online ecosystem.
Mapping (Dis-)Information Flow about the MH17 Plane Crash
Digital media enables not only the fast sharing of information, but also of disinformation. One prominent case of an event leading to the circulation of disinformation on social media is the MH17 plane crash. Studies analysing the spread of information about this event on Twitter have focused on small, manually annotated datasets, or used proxies for data annotation. In this work, we examine to what extent text classifiers can be used to label data for subsequent content analysis; in particular, we focus on predicting pro-Russian and pro-Ukrainian Twitter content related to the MH17 plane crash. Even though we find that a neural classifier improves over a hashtag-based baseline, labeling pro-Russian and pro-Ukrainian content with high precision remains a challenging problem. We provide an error analysis underlining the difficulty of the task and identify factors that might help improve classification in future work. Finally, we show how the classifier can facilitate the annotation task for human annotators.
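A hashtag-based baseline of the kind the paper compares against can be sketched in a few lines; the hashtag lists below are invented examples, not the authors' actual lexicon. Tweets with no (or conflicting) hashtag evidence stay unlabeled, which is exactly the coverage weakness such baselines have.

```python
# Hypothetical stance hashtag lexicons (illustrative only).
PRO_RU = {"#mh17truth_ru"}
PRO_UA = {"#mh17justice"}

def hashtag_label(tweet):
    """Label a tweet's stance purely from which lexicon its hashtags hit."""
    tags = {w.lower() for w in tweet.split() if w.startswith("#")}
    if tags & PRO_RU and not tags & PRO_UA:
        return "pro-russian"
    if tags & PRO_UA and not tags & PRO_RU:
        return "pro-ukrainian"
    return "unknown"  # no or conflicting hashtag evidence

print(hashtag_label("Remember the victims #MH17justice"))
```

A neural classifier can label the many tweets that carry no stance-bearing hashtag at all, which is why it improves over this baseline even when its precision on hard cases remains limited.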
Large-scale Affective Computing for Visual Multimedia
In recent years, Affective Computing has arisen as a prolific interdisciplinary field for engineering systems that integrate human affections. While human-computer relationships have long revolved around cognitive interactions, it is becoming increasingly important to account for human affect, or feelings or emotions, to avert user experience frustration, provide disability services, predict virality of social media content, etc. In this thesis, we specifically focus on Affective Computing as it applies to large-scale visual multimedia, and in particular, still images, animated image sequences and video streams, above and beyond the traditional approaches of face expression and gesture recognition. By taking a principled psychology-grounded approach, we seek to paint a more holistic and colorful view of computational affect in the context of visual multimedia. For example, should emotions like 'surprise' and 'fear' be assumed to be orthogonal output dimensions? Or does a 'positive' image in one culture's view elicit the same feelings of positivity in another culture? We study affect frameworks and ontologies to define, organize and develop machine learning models with such questions in mind to automatically detect affective visual concepts.
In the push for what we call "Big Affective Computing," we focus on two dimensions of scale for affect -- scaling up and scaling out -- which we propose are both imperative if we are to scale the Affective Computing problem successfully. Intuitively, simply increasing the number of data points corresponds to "scaling up." Less intuitive, however, is when problems like Affective Computing "scale out," or diversify. We show that this latter dimension of introducing data variety, alongside the former of introducing data volume, can yield particular insights, since human affections naturally depart from traditional Machine Learning and Computer Vision problems where there is an objectively truthful target. While no one might debate that a picture of a 'dog' should be tagged as a 'dog,' not all may agree that it looks 'ugly.' We present extensive discussions on why scaling out is critical and how it can be accomplished in the context of large-volume visual data.
At a high-level, the main contributions of this thesis include:
Multiplicity of Affect Oracles:
Prior to the work in this thesis, little consideration had been paid to the affective label-generating mechanism when learning functional mappings between inputs and labels. Throughout this thesis, but first in Chapter 2, starting in Section 2.1.2, we make a case for a conceptual partitioning of the affect oracle governing the label generation process in Affective Computing problems, resulting in a multiplicity of oracles, whereas prior works assumed there was a single universal oracle. In Chapter 3, the differences between intended versus expressed versus induced versus perceived emotion are discussed, where we argue that perceived emotion is particularly well suited for scaling up because it reduces label variance due to its more objective nature compared to other affect states. And in Chapters 4 and 5, a division of the affect oracle along cultural lines, with manifestations in both language and geography, is explored. We accomplish all this without sacrificing the 'scale up' dimension, and tackle significantly larger-volume problems than prior comparable visual Affective Computing research.
Content-driven Visual Affect Detection:
Traditionally, in most Affective Computing work, prediction tasks use psycho-physiological signals from subjects viewing the stimuli of interest, e.g., a video advertisement, as the system inputs. In essence, this means that the machine learns to label a proxy signal rather than the stimuli itself. In this thesis, with the rise of strong Computer Vision and Multimedia techniques, we focus on learning to label the stimuli directly, without a biometric proxy signal provided by a human subject (except in the unique circumstances of Chapter 7). This shift toward learning from the stimuli directly is important because it allows us to scale up with much greater ease, given that biometric measurement acquisition is both low-throughput and somewhat invasive, while stimuli are often readily available. In addition, moving toward learning directly from the stimuli will allow researchers to precisely determine which low-level features in the stimuli are actually coupled with affect states, e.g., which set of frames caused viewer discomfort, rather than a broad sense that a video was discomforting. In Part I of this thesis, we illustrate an emotion prediction task with a psychology-grounded affect representation. In particular, in Chapter 3, we develop a prediction task over semantic emotional classes, e.g., 'sad,' 'happy' and 'angry,' using animated image sequences given annotations from over 2.5 million users. Subsequently, in Part II, we develop visual sentiment and adjective-based semantics models from million-scale digital imagery mined from a social multimedia platform.
Mid-level Representations for Visual Affect:
While discrete semantic emotions and sentiment are classical representations of affect with decades of psychology grounding, the interdisciplinary nature of Affective Computing, now only about two decades old, allows for new avenues of representation. Mid-level representations have been proposed in numerous Computer Vision and Multimedia problems as an intermediary, and often more computable, step toward bridging the semantic gap between low-level system inputs and high-level label semantic abstractions. In Part II, inspired by this work, we adapt it for vision-based Affective Computing and adopt a semantic construct called adjective-noun pairs. Specifically, in Chapter 4, we explore the use of such adjective-noun pairs in the context of a social multimedia platform and develop a multilingual visual sentiment ontology with over 15,000 affective mid-level visual concepts across 12 languages, associated with over 7.3 million images and representations from over 235 countries, resulting in the largest affective digital image corpus in both depth and breadth to date. In Chapter 5, we develop computational methods to predict such adjective-noun pairs and also explore their usefulness in traditional sentiment analysis, but from a previously unexplored cross-lingual perspective. And in Chapter 6, we propose a new learning setting called 'cross-residual learning,' building off recent successes in deep neural networks, and specifically in residual learning; we show that cross-residual learning can be used effectively to jointly learn across multiple related tasks in object detection (nouns), more traditional affect modeling (adjectives), and affective mid-level representations (adjective-noun pairs), giving us a framework for better grounding the adjective-noun pair bridge in both vision and affect simultaneously.
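The abstract names cross-residual learning without defining it, so the following is only a loose toy sketch of the idea it suggests: task branches (noun, adjective, adjective-noun pair) share a trunk and add small residual connections across each other, so related tasks share evidence. The shared features, branch weights, and mixing coefficient are all invented, not learned parameters.

```python
def dot(w, x):
    return sum(a * b for a, b in zip(w, x))

# Toy shared trunk features for one image.
shared = [0.5, -0.2, 0.8]

# Per-task linear branches (toy weights standing in for learned layers).
W = {
    "noun":      [1.0, 0.0, 0.5],
    "adjective": [0.0, 1.0, 0.5],
    "anp":       [0.5, 0.5, 0.5],
}

def branch(task):
    """A task's own branch output from the shared trunk."""
    return dot(W[task], shared)

def cross_residual(task, alpha=0.1):
    """Task output = own branch + a small residual from the sibling branches."""
    siblings = [t for t in W if t != task]
    return branch(task) + alpha * sum(branch(t) for t in siblings)

for t in W:
    print(t, round(cross_residual(t), 3))
```

With alpha set to zero this collapses to three independent heads; the cross terms are what let the noun and adjective branches inform the adjective-noun pair prediction and vice versa.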
Detecting Political Framing Shifts and the Adversarial Phrases within Rival Factions and Ranking Temporal Snapshot Contents in Social Media
Social Computing is an area of computer science concerned with the dynamics of communities and cultures created through computer-mediated social interaction. Various social media platforms, such as social network services and microblogging, enable users to come together and create social movements expressing their opinions on diverse sets of issues, events, complaints, grievances, and goals. Methods for monitoring and summarizing these types of sociopolitical trends, their leaders and followers, messages, and dynamics are needed. In this dissertation, a framework comprising community- and content-based computational methods is presented to provide insights for multilingual and noisy political social media content. First, a model is developed to predict the emergence of viral hashtag breakouts, using network features. Next, another model is developed to detect and compare individual and organizational accounts, by using a set of domain- and language-independent features. The third model exposes contentious issues driving reactionary dynamics between opposing camps. The fourth model develops community detection and visualization methods to reveal underlying dynamics and key messages that drive dynamics. The final model presents a use-case methodology for detecting and monitoring foreign influence, wherein a state actor and news media under its control attempt to shift public opinion by framing information to support multiple adversarial narratives that facilitate their goals. In each case, a discussion of novel aspects and contributions of the models is presented, as well as quantitative and qualitative evaluations. An analysis of multiple conflict situations is conducted, covering areas in the UK, Bangladesh, Libya and Ukraine where adversarial framing led to polarization, declines in social cohesion, social unrest, and even civil wars (e.g., Libya and Ukraine).
Doctoral Dissertation, Computer Science, 201
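The abstract does not specify the network features behind the hashtag-breakout model, so the sketch below is purely illustrative of the idea: compute a couple of plausible early-cascade features (adoption rate, and how connected the adopters are to earlier adopters) and apply a toy threshold rule in place of a trained classifier. The adoption events, follower edges, and thresholds are invented.

```python
# Hypothetical early adoption events for one hashtag: (minute, user).
adoptions = [(0, "a"), (1, "b"), (1, "c"), (2, "d"), (2, "e"), (2, "f")]

# Hypothetical follower edges: (follower, followee).
follows = {("b", "a"), ("c", "a"), ("d", "b"), ("e", "b"), ("f", "c")}

def growth_rate(events, window=2):
    """Adoptions per minute over the most recent `window` minutes."""
    recent = [u for t, u in events if t >= events[-1][0] - window + 1]
    return len(recent) / window

def adopter_connectivity(events):
    """Fraction of adopters who follow an earlier adopter (cascade-like spread)."""
    users = [u for _, u in events]
    linked = sum(
        any((u, v) in follows for v in users[:i]) for i, u in enumerate(users)
    )
    return linked / len(users)

def likely_breakout(events, rate_min=2.0, conn_min=0.5):
    # A trained model would replace this hand-set threshold rule.
    return growth_rate(events) >= rate_min and adopter_connectivity(events) >= conn_min

print(growth_rate(adoptions), adopter_connectivity(adoptions), likely_breakout(adoptions))
```

High connectivity among early adopters distinguishes network-driven cascades from hashtags that merely trend through independent, exogenous mentions, which is the kind of signal network features are meant to capture.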
Social informatics
5th International Conference, SocInfo 2013, Kyoto, Japan, November 25-27, 2013, Proceedings