Clustering Memes in Social Media
The increasing pervasiveness of social media creates new opportunities to
study human social behavior, while challenging our capability to analyze their
massive data streams. One of the emerging tasks is to distinguish between
different kinds of activities, for example engineered misinformation campaigns
versus spontaneous communication. Such detection problems require a formal
definition of meme, or unit of information that can spread from person to
person through the social network. Once a meme is identified, supervised
learning methods can be applied to classify different types of communication.
The appropriate granularity of a meme, however, is hardly captured by
existing entities such as tags and keywords. Here we present a framework for
the novel task of detecting memes by clustering messages from large streams of
social data. We evaluate various similarity measures that leverage content,
metadata, network features, and their combinations. We also explore the idea of
pre-clustering on the basis of existing entities. A systematic evaluation is
carried out using a manually curated dataset as ground truth. Our analysis
shows that pre-clustering and a combination of heterogeneous features yield the
best trade-off between number of clusters and their quality, demonstrating that
a simple combination based on pairwise maximization of similarity is as
effective as a non-trivial optimization of parameters. Our approach is fully
automatic, unsupervised, and scalable for real-time detection of memes in
streaming data.
Comment: Proceedings of the 2013 IEEE/ACM International Conference on Advances
in Social Networks Analysis and Mining (ASONAM'13), 201
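The "pairwise maximization of similarity" mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each feature type (content, network, and so on) has already produced a pairwise similarity matrix scaled to [0, 1]; the function name `combine_max` and the toy values are hypothetical and are not taken from the paper.

```python
import numpy as np

def combine_max(*similarities):
    """Element-wise maximum over same-shaped pairwise similarity matrices."""
    stacked = np.stack([np.asarray(s, dtype=float) for s in similarities])
    return stacked.max(axis=0)

# Invented toy similarities between three messages (illustration only).
content = np.array([[1.0, 0.2, 0.1],
                    [0.2, 1.0, 0.7],
                    [0.1, 0.7, 1.0]])
network = np.array([[1.0, 0.8, 0.0],
                    [0.8, 1.0, 0.3],
                    [0.0, 0.3, 1.0]])

# Each pair of messages keeps the strongest evidence from any feature type.
combined = combine_max(content, network)
print(combined)
```

Taking the maximum needs no tuned weights, which is presumably why the abstract contrasts it with a non-trivial optimization of parameters.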
On the Evolution of (Hateful) Memes by Means of Multimodal Contrastive Learning
The dissemination of hateful memes online has adverse effects on social media
platforms and the real world. Detecting hateful memes is challenging, one of
the reasons being the evolutionary nature of memes; new hateful memes can
emerge by fusing hateful connotations with other cultural ideas or symbols. In
this paper, we propose a framework that leverages multimodal contrastive
learning models, in particular OpenAI's CLIP, to identify targets of hateful
content and systematically investigate the evolution of hateful memes. We find
that semantic regularities exist in CLIP-generated embeddings that describe
semantic relationships within the same modality (images) or across modalities
(images and text). Leveraging this property, we study how hateful memes are
created by combining visual elements from multiple images or fusing textual
information with a hateful image. We demonstrate the capabilities of our
framework for analyzing the evolution of hateful memes by focusing on
antisemitic memes, particularly the Happy Merchant meme. Using our framework on
a dataset extracted from 4chan, we find 3.3K variants of the Happy Merchant
meme, with some linked to specific countries, persons, or organizations. We
envision that our framework can be used to aid human moderators by flagging new
variants of hateful memes so that moderators can manually verify them and
mitigate the problem of hateful content online.
Comment: To Appear in the 44th IEEE Symposium on Security and Privacy, May
22-25, 202
TotalDefMeme: A Multi-Attribute Meme dataset on Total Defence in Singapore
Total Defence is a defence policy combining and extending the concept of
military defence and civil defence. While several countries have adopted total
defence as their defence policy, very few studies have investigated its
effectiveness. With the rapid proliferation of social media and digitalisation,
many social studies have been focused on investigating policy effectiveness
through specially curated surveys and questionnaires either through digital
media or traditional forms. However, such references may not truly reflect the
underlying sentiments about the target policies or initiatives of interest.
People are more likely to express their sentiment using communication mediums
such as starting topic threads on forums or sharing memes on social media. Using
Singapore as a case reference, this study aims to address this research gap by
proposing TotalDefMeme, a large-scale multi-modal and multi-attribute meme
dataset that captures public sentiments toward Singapore's Total Defence
policy. Besides supporting social informatics and public policy analysis of the
Total Defence policy, TotalDefMeme can also support many downstream multi-modal
machine learning tasks, such as aspect-based stance classification and
multi-modal meme clustering. We perform baseline machine learning experiments
on TotalDefMeme and evaluate its technical validity, and present possible
future interdisciplinary research directions and application scenarios using
the dataset as a baseline.
Comment: 6 pages. Accepted at ACM MMSys 202
Cluster-based Deep Ensemble Learning for Emotion Classification in Internet Memes
Memes have gained popularity as a means to share visual ideas through the
Internet and social media by mixing text, images and videos, often for humorous
purposes. Research enabling automated analysis of memes has gained attention in
recent years, including among others the task of classifying the emotion
expressed in memes. In this paper, we propose a novel model, cluster-based deep
ensemble learning (CDEL), for emotion classification in memes. CDEL is a hybrid
model that leverages the benefits of a deep learning model in combination with
a clustering algorithm, which enhances the model with additional information
after clustering memes with similar facial features. We evaluate the
performance of CDEL on a benchmark dataset for emotion classification, proving
its effectiveness by outperforming a wide range of baseline models and
achieving state-of-the-art performance. Further evaluation through ablated
models demonstrates the effectiveness of the different components of CDEL.
Online Popularity and Topical Interests through the Lens of Instagram
Online socio-technical systems can be studied as a proxy for the real world to
investigate human behavior and social interactions at scale. Here we focus on
Instagram, a media-sharing online platform whose popularity has risen to
hundreds of millions of users. Instagram exhibits a mixture of features
including social structure, social tagging and media sharing. The network of
social interactions among users models various dynamics including
follower/followee relations and users' communication by means of
posts/comments. Users can upload and tag media such as photos and pictures, and
they can "like" and comment on each piece of information on the platform. In this
work we investigate three major aspects on our Instagram dataset: (i) the
structural characteristics of its network of heterogeneous interactions, to
unveil the emergence of self-organization and topically-induced community
structure; (ii) the dynamics of content production and consumption, to
understand how global trends and popular users emerge; (iii) the behavior of
users labeling media with tags, to determine how they devote their attention
and to explore the variety of their topical interests. Our analysis provides
clues to understand human behavior dynamics on socio-technical systems,
specifically users and content popularity, the mechanisms of users'
interactions in online environments and how collective trends emerge from
individuals' topical interests.
Comment: 11 pages, 11 figures, Proceedings of ACM Hypertext 201
On the Origins of Memes by Means of Fringe Web Communities
Internet memes are increasingly used to sway and manipulate public opinion.
This prompts the need to study their propagation, evolution, and influence
across the Web. In this paper, we detect and measure the propagation of memes
across multiple Web communities, using a processing pipeline based on
perceptual hashing and clustering techniques, and a dataset of 160M images from
2.6B posts gathered from Twitter, Reddit, 4chan's Politically Incorrect board
(/pol/), and Gab, over the course of 13 months. We group the images posted on
fringe Web communities (/pol/, Gab, and The_Donald subreddit) into clusters,
annotate them using meme metadata obtained from Know Your Meme, and also map
images from mainstream communities (Twitter and Reddit) to the clusters.
Our analysis provides an assessment of the popularity and diversity of memes
in the context of each community, showing, e.g., that racist memes are
extremely common in fringe Web communities. We also find a substantial number
of politics-related memes on both mainstream and fringe Web communities,
supporting media reports that memes might be used to enhance or harm
politicians. Finally, we use Hawkes processes to model the interplay between
Web communities and quantify their reciprocal influence, finding that /pol/
substantially influences the meme ecosystem with the number of memes it
produces, while The_Donald has a higher success rate in pushing them to other
communities.
Comment: A shorter version of this paper appears in the Proceedings of 18th
ACM Internet Measurement Conference (IMC 2018). This is the full versio
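The abstract above describes a pipeline based on perceptual hashing and clustering without naming the specific algorithms. As a rough sketch of the general idea only, the following uses a simple average hash and a greedy Hamming-distance grouping as stand-ins. Both choices, the function names, and the tiny 2x2 "images" are assumptions made for illustration, not the authors' method.

```python
def average_hash(pixels):
    """Return a bit tuple: 1 where a pixel is brighter than the grid's mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)

def hamming(a, b):
    """Number of positions at which two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))

def cluster(hashes, max_distance):
    """Greedy grouping: join the first cluster whose representative is close enough."""
    clusters = []  # list of (representative_hash, member_indices)
    for i, h in enumerate(hashes):
        for rep, members in clusters:
            if hamming(rep, h) <= max_distance:
                members.append(i)
                break
        else:
            clusters.append((h, [i]))
    return [members for _, members in clusters]

# Invented grayscale grids: the first two are near-duplicates, the third is inverted.
images = [
    [[10, 200], [220, 15]],
    [[12, 190], [210, 30]],
    [[250, 5], [8, 240]],
]
hashes = [average_hash(img) for img in images]
groups = cluster(hashes, max_distance=1)
print(groups)
```

Near-duplicate images end up with identical or nearly identical hashes, so small edits such as recompression or slight recoloring do not split a meme into separate clusters.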
Timescales of Massive Human Entrainment
The past two decades have seen an upsurge of interest in the collective
behaviors of complex systems composed of many agents entrained to each other
and to external events. In this paper, we extend concepts of entrainment to the
dynamics of human collective attention. We conducted a detailed investigation
of the unfolding of human entrainment - as expressed by the content and
patterns of hundreds of thousands of messages on Twitter - during the 2012 US
presidential debates. By time locking these data sources, we quantify the
impact of the unfolding debate on human attention. We show that collective
social behavior covaries second-by-second with the interactional dynamics of the
debates: A candidate speaking induces rapid increases in mentions of his name
on social media and decreases in mentions of the other candidate. Moreover,
interruptions by an interlocutor increase the attention received. We also
highlight a distinct time scale for the impact of salient moments in the
debate: Mentions in social media start within 5-10 seconds after the moment;
peak at approximately one minute; and slowly decay in a consistent fashion
across well-known events during the debates. Finally, we show that public
attention after an initial burst slowly decays through the course of the
debates. Thus we demonstrate that large-scale human entrainment may hold across
a number of distinct scales, in an exquisitely time-locked fashion. The methods
and results pave the way for careful study of the dynamics and mechanisms of
large-scale human entrainment.
Comment: 20 pages, 7 figures, 6 tables, 4 supplementary figures. 2nd version
revised according to peer reviewers' comments: more detailed explanation of
the methods, and grounding of the hypothese
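Time locking, as used in the abstract above, can be sketched as event-aligned averaging: per-second mention counts are cut into windows that start at each salient moment and are then averaged lag by lag. The function name, the window length, and the counts below are hypothetical and invented for illustration; the paper's actual procedure may differ.

```python
def time_locked_average(counts, event_times, window):
    """Average per-second counts at each lag 0..window-1 after every event."""
    # Keep only events whose full window fits inside the series.
    segments = [counts[t:t + window] for t in event_times
                if t + window <= len(counts)]
    if not segments:
        raise ValueError("no event has a complete window in the series")
    return [sum(seg[lag] for seg in segments) / len(segments)
            for lag in range(window)]

# Invented mentions per second, with salient moments at seconds 2 and 7.
counts = [0, 0, 5, 9, 4, 1, 0, 3, 7, 2, 1, 0]
events = [2, 7]
profile = time_locked_average(counts, events, window=4)
print(profile)
```

Averaging across events suppresses unrelated fluctuations, so a consistent rise, peak, and decay after salient moments becomes visible in the resulting profile.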