Connecting Dream Networks Across Cultures
Many species dream, yet there remain many open research questions in the
study of dreams. The symbolism of dreams and their interpretation are present in
cultures throughout history. Analysis of online data sources for dream
interpretation using network science leads to understanding symbolism in dreams
and their associated meaning. In this study, we introduce dream interpretation
networks for English, Chinese and Arabic that represent different cultures from
various parts of the world. We analyze communities in these networks, finding
that symbols within a community are semantically related. The central nodes in
communities give insight about cultures and symbols in dreams. The community
structure of different networks highlights cultural similarities and
differences. Interconnections between different networks are also identified by
translating symbols from different languages into English. Structural
correlations across networks point out relationships between cultures.
Similarities between network communities are also investigated by analysis of
sentiment in symbol interpretations. We find that interpretations within a
community tend to have similar sentiment. Furthermore, we cluster communities
based on their sentiment, yielding three main categories of positive, negative,
and neutral dream symbols.
Comment: 6 pages, 3 figures
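The grouping of communities into positive, negative, and neutral categories can be illustrated with a minimal sketch. It swaps the clustering step for a plain threshold rule on each community's mean sentiment, so it is not the authors' procedure. The function name `label_communities`, the `margin` cutoff, the community names, and all scores are made up for illustration.

```python
from statistics import mean


def label_communities(community_scores, margin=0.1):
    """Assign each community a coarse sentiment label from its mean score.

    community_scores maps a community name to a list of per-interpretation
    sentiment scores in [-1, 1] (hypothetical input format). margin is an
    assumed cutoff: means within +/- margin of zero count as neutral.
    """
    labels = {}
    for name, scores in community_scores.items():
        avg = mean(scores)
        if avg > margin:
            labels[name] = "positive"
        elif avg < -margin:
            labels[name] = "negative"
        else:
            labels[name] = "neutral"
    return labels


# Made-up scores for three hypothetical symbol communities.
example = {
    "wealth": [0.6, 0.4, 0.8],
    "loss": [-0.7, -0.5, -0.3],
    "travel": [0.05, -0.05, 0.0],
}
labels = label_communities(example)
```

A threshold rule keeps the example short. An actual clustering method would learn the category boundaries from the data instead of fixing them in advance.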
Should we agree to disagree about Twitter's bot problem?
Bots, simply defined as accounts controlled by automation, can be used as a
weapon for online manipulation and pose a threat to the health of platforms.
Researchers have studied online platforms to detect, estimate, and characterize
bot accounts. Concerns about the prevalence of bots were raised following Elon
Musk's bid to acquire Twitter. Twitter's recent estimate that 5% of
monetizable daily active users are bot accounts raised questions about its
methodology. This estimate is based on a specific number of active users and
relies on Twitter's criteria for bot accounts. In this work, we want to stress
that crucial questions need to be answered in order to make a proper estimation
and compare different methodologies. We argue that assumptions about bot-likely
behavior, the detection approach, and the population inspected can affect the
estimation of the percentage of bots on Twitter. Finally, we emphasize the
responsibility of platforms to be vigilant, transparent, and unbiased in
dealing with threats that may affect their users.
Comment: 22 pages, 5 figures
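To make concrete how the detection approach can shift a bot-percentage estimate, here is a minimal sketch that applies several cutoffs to the same set of bot-likelihood scores. The function `estimated_bot_share`, the scores, and the thresholds are all hypothetical and are not taken from the paper or from any real detector.

```python
def estimated_bot_share(scores, threshold):
    """Fraction of accounts whose bot score is at or above the threshold."""
    if not scores:
        raise ValueError("scores must be non-empty")
    flagged = sum(1 for s in scores if s >= threshold)
    return flagged / len(scores)


# Made-up bot-likelihood scores in [0, 1] for ten accounts.
scores = [0.05, 0.10, 0.20, 0.35, 0.45, 0.55, 0.60, 0.75, 0.90, 0.95]

# The same population yields different estimates under different cutoffs.
shares = {t: estimated_bot_share(scores, t) for t in (0.5, 0.7, 0.9)}
```

With these invented numbers the estimated share falls as the cutoff rises. Restricting `scores` to a different population, such as only recently active accounts, would change the estimate in the same way.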
Traveling Trends: Social Butterflies or Frequent Fliers?
Trending topics are the online conversations that grab collective attention
on social media. They are continually changing and often reflect exogenous
events that happen in the real world. Trends are localized in space and time as
they are driven by activity in specific geographic areas that act as sources of
traffic and information flow. Taken independently, trends and geography have
been discussed in recent literature on online social media; although, so far,
little has been done to characterize the relation between trends and geography.
Here we investigate more than eleven thousand topics that trended on Twitter in
63 main US locations during a period of 50 days in 2013. This data allows us to
study the origins and pathways of trends, how they compete for popularity at
the local level to emerge as winners at the country level, and what dynamics
underlie their production and consumption in different geographic areas. We
identify two main classes of trending topics: those that surface locally,
coinciding with three different geographic clusters (East coast, Midwest and
Southwest); and those that emerge globally from several metropolitan areas,
coinciding with the major air traffic hubs of the country. These hubs act as
trendsetters, generating topics that eventually trend at the country level, and
driving the conversation across the country. This poses an intriguing
conjecture, drawing a parallel between the spread of information and diseases:
Do trends travel faster by airplane than over the Internet?
Comment: Proceedings of the first ACM conference on Online social networks,
pp. 213-222, 201
Unsupervised detection of coordinated fake-follower campaigns on social media
Automated social media accounts, known as bots, are increasingly recognized
as key tools for manipulative online activities. These activities can stem from
coordination among several accounts and these automated campaigns can
manipulate social network structure by following other accounts, amplifying
their content, and posting messages to spam online discourse. In this study, we
present a novel unsupervised detection method that targets a specific
category of malicious accounts designed to manipulate user metrics such as
online popularity. Our framework identifies anomalous following patterns among
all the followers of a social media account. Through the analysis of a large
number of accounts on the Twitter platform (rebranded as X after its
acquisition by Elon Musk), we demonstrate that irregular following patterns are
prevalent and are indicative of automated fake accounts. Notably, we find that
these detected groups of anomalous followers exhibit consistent behavior across
multiple accounts. This observation, combined with the computational efficiency
of our proposed approach, makes it a valuable tool for investigating
large-scale coordinated manipulation campaigns on social media platforms.
Comment: 17 pages, 5 figures, 1 table and supplementary information
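One way to picture an "anomalous following pattern" is a run of consecutive followers whose accounts were all created at nearly the same time. The sketch below flags such runs with a sliding window. This is an assumed heuristic for illustration only; the abstract does not specify the authors' method. The function name, the input format, and the parameters `window` and `max_spread` are hypothetical.

```python
def flag_suspicious_windows(creation_days, window=5, max_spread=2):
    """Return start indices of windows whose accounts were created close together.

    creation_days lists the account creation day of each follower, in the
    order the followers arrived (hypothetical input format). A window is
    flagged when its creation days span at most max_spread days.
    """
    flagged = []
    for start in range(len(creation_days) - window + 1):
        chunk = creation_days[start:start + window]
        if max(chunk) - min(chunk) <= max_spread:
            flagged.append(start)
    return flagged


# Made-up data: followers 5 to 10 were all created around day 700.
followers = [120, 860, 415, 1002, 37, 700, 701, 700, 702, 701, 701, 255, 940]
flagged = flag_suspicious_windows(followers)
```

The sketch needs no labeled examples, which is the sense in which such a check is unsupervised. A real system would need a principled way to choose the window size and spread.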
TurkishBERTweet: Fast and Reliable Large Language Model for Social Media Analysis
Turkish is one of the most popular languages in the world. The wide use of this
language on social media platforms such as Twitter, Instagram, and TikTok, and
the strategic position of the country in world politics, make it appealing to
social network researchers and industry. To address this need, we introduce
TurkishBERTweet, the first large-scale pre-trained language model for Turkish
social media, built using almost 900 million tweets. The model shares the same
architecture as the base BERT model but with a smaller input length, making
TurkishBERTweet lighter than BERTurk and able to achieve significantly lower
inference time. We trained our model using the same approach as the RoBERTa
model and evaluated it on two text classification tasks: Sentiment
Classification and Hate Speech Detection. We demonstrate that TurkishBERTweet
outperforms the other available alternatives on generalizability and that its
lower inference time gives it a significant advantage in processing large-scale
datasets. We also compared our models with the commercial OpenAI solutions in
terms of cost and performance to demonstrate that TurkishBERTweet is a scalable
and cost-effective solution. As part of
our research, we released TurkishBERTweet and fine-tuned LoRA adapters for the
mentioned tasks under the MIT License to facilitate future research and
applications on Turkish social media. Our TurkishBERTweet model is available
at: https://github.com/ViralLab/TurkishBERTweet
Comment: 21 pages, 4 figures, 8 tables
Evolution of Online User Behavior During a Social Upheaval
Social media represent powerful tools of mass communication and information
diffusion. They played a pivotal role during recent social uprisings and
political mobilizations across the world. Here we present a study of the Gezi
Park movement in Turkey through the lens of Twitter. We analyze over 2.3
million tweets produced during the 25 days of protest that occurred between May and
June 2013. We first characterize the spatio-temporal nature of the conversation
about the Gezi Park demonstrations, showing that similarity in trends of
discussion mirrors geographic cues. We then describe the characteristics of the
users involved in this conversation and what roles they played. We study how
roles and individual influence evolved during the period of the upheaval. This
analysis reveals that the conversation becomes more democratic as events
unfold, with a redistribution of influence over time in the user population. We
conclude by observing how the online and offline worlds are tightly
intertwined, showing that exogenous events, such as political speeches or
police actions, affect social media conversations and trigger changes in
individual behavior.
Comment: Best Paper Award at ACM Web Science 201
Online Human-Bot Interactions: Detection, Estimation, and Characterization
Increasing evidence suggests that a growing amount of social media content is
generated by autonomous entities known as social bots. In this work we present
a framework to detect such entities on Twitter. We leverage more than a
thousand features extracted from public data and meta-data about users:
friends, tweet content and sentiment, network patterns, and activity time
series. We benchmark the classification framework by using a publicly available
dataset of Twitter bots. This training data is enriched by a manually annotated
collection of active Twitter users that includes both humans and bots of varying
sophistication. Our models yield high accuracy and agreement with each other
and can detect bots of different nature. Our estimates suggest that between 9%
and 15% of active Twitter accounts are bots. Characterizing ties among
accounts, we observe that simple bots tend to interact with bots that exhibit
more human-like behaviors. Analysis of content flows reveals retweet and
mention strategies adopted by bots to interact with different target groups.
Using clustering analysis, we characterize several subclasses of accounts,
including spammers, self promoters, and accounts that post content from
connected applications.
Comment: Accepted paper for ICWSM'17, 10 pages, 8 figures, 1 table
Hidden Citations Obscure True Impact in Science
References, the mechanism scientists rely on to signal previous knowledge,
lately have turned into widely used and misused measures of scientific impact.
Yet, when a discovery becomes common knowledge, citations suffer from
obliteration by incorporation. This leads to the concept of hidden citation,
representing a clear textual credit to a discovery without a reference to the
publication embodying it. Here, we rely on unsupervised interpretable machine
learning applied to the full text of each paper to systematically identify
hidden citations. We find that for influential discoveries hidden citations
outnumber citation counts, emerging regardless of publishing venue and
discipline. We show that the prevalence of hidden citations is not driven by
citation counts, but rather by the degree of the discourse on the topic within
the text of the manuscripts, indicating that the more a discovery is discussed,
the less visible it is to standard bibliometric analysis. Hidden citations
indicate that bibliometric measures offer a limited perspective on quantifying
the true impact of a discovery, raising the need to extract knowledge from the
full text of the scientific corpus.