Connecting Dream Networks Across Cultures

Author: Menczer Filippo
Varol Onur
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014

Many species dream, yet there remain many open research questions in the study of dreams. The symbolism of dreams and their interpretation is present in cultures throughout history. Analysis of online data sources for dream interpretation using network science leads to understanding symbolism in dreams and their associated meaning. In this study, we introduce dream interpretation networks for English, Chinese and Arabic that represent different cultures from various parts of the world. We analyze communities in these networks, finding that symbols within a community are semantically related. The central nodes in communities give insight about cultures and symbols in dreams. The community structure of different networks highlights cultural similarities and differences. Interconnections between different networks are also identified by translating symbols from different languages into English. Structural correlations across networks point out relationships between cultures. Similarities between network communities are also investigated by analysis of sentiment in symbol interpretations. We find that interpretations within a community tend to have similar sentiment. Furthermore, we cluster communities based on their sentiment, yielding three main categories of positive, negative, and neutral dream symbols.

Comment: 6 pages, 3 figures

arXiv.org e-Print Archive

Crossref

Should we agree to disagree about Twitter's bot problem?

Author: Varol Onur
Publication date: 05/11/2022

Bots, simply defined as accounts controlled by automation, can be used as a weapon for online manipulation and pose a threat to the health of platforms. Researchers have studied online platforms to detect, estimate, and characterize bot accounts. Concerns about the prevalence of bots were raised following Elon Musk's bid to acquire Twitter. Twitter's recent estimate that 5% of monetizable daily active users are bot accounts raised questions about its methodology. This estimate is based on a specific number of active users and relies on Twitter's criteria for bot accounts. In this work, we want to stress that crucial questions need to be answered in order to make a proper estimation and compare different methodologies. We argue that assumptions on bot-likely behavior, the detection approach, and the population inspected can affect the estimation of the percentage of bots on Twitter. Finally, we emphasize the responsibility of platforms to be vigilant, transparent, and unbiased in dealing with threats that may affect their users.

Comment: 22 pages, 5 figures

arXiv.org e-Print Archive

Sabanci University Research Database

Traveling Trends: Social Butterflies or Frequent Fliers?

Author: Ferrara Emilio
Flammini Alessandro
Menczer Filippo
Varol Onur
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2013

Trending topics are the online conversations that grab collective attention on social media. They are continually changing and often reflect exogenous events that happen in the real world. Trends are localized in space and time as they are driven by activity in specific geographic areas that act as sources of traffic and information flow. Taken independently, trends and geography have been discussed in recent literature on online social media; so far, however, little has been done to characterize the relation between trends and geography. Here we investigate more than eleven thousand topics that trended on Twitter in 63 main US locations during a period of 50 days in 2013. This data allows us to study the origins and pathways of trends, how they compete for popularity at the local level to emerge as winners at the country level, and what dynamics underlie their production and consumption in different geographic areas. We identify two main classes of trending topics: those that surface locally, coinciding with three different geographic clusters (East coast, Midwest and Southwest); and those that emerge globally from several metropolitan areas, coinciding with the major air traffic hubs of the country. These hubs act as trendsetters, generating topics that eventually trend at the country level, and driving the conversation across the country. This poses an intriguing conjecture, drawing a parallel between the spread of information and diseases: Do trends travel faster by airplane than over the Internet?

Comment: Proceedings of the first ACM conference on Online social networks, pp. 213-222, 201

arXiv.org e-Print Archive

Crossref

Unsupervised detection of coordinated fake-follower campaigns on social media

Author: Varol Onur
Zouzou Yasser
Publication date: 31/10/2023

Automated social media accounts, known as bots, are increasingly recognized as key tools for manipulative online activities. These activities can stem from coordination among several accounts, and these automated campaigns can manipulate social network structure by following other accounts, amplifying their content, and posting messages to spam online discourse. In this study, we present a novel unsupervised detection method that targets a specific category of malicious accounts designed to manipulate user metrics such as online popularity. Our framework identifies anomalous following patterns among all the followers of a social media account. Through the analysis of a large number of accounts on the Twitter platform (rebranded as X after its acquisition by Elon Musk), we demonstrate that irregular following patterns are prevalent and are indicative of automated fake accounts. Notably, we find that these detected groups of anomalous followers exhibit consistent behavior across multiple accounts. This observation, combined with the computational efficiency of our proposed approach, makes it a valuable tool for investigating large-scale coordinated manipulation campaigns on social media platforms.

Comment: 17 pages, 5 figures, 1 table and supplementary information

arXiv.org e-Print Archive

TurkishBERTweet: Fast and Reliable Large Language Model for Social Media Analysis

Author: Najafi Ali
Varol Onur
Publication date: 29/11/2023

Turkish is one of the most popular languages in the world. The wide use of this language on social media platforms such as Twitter, Instagram, and TikTok, together with the country's strategic position in world politics, makes it appealing to social network researchers and industry. To address this need, we introduce TurkishBERTweet, the first large-scale pre-trained language model for Turkish social media, built using almost 900 million tweets. The model shares the architecture of the base BERT model but uses a smaller input length, making TurkishBERTweet lighter than BERTurk and giving it significantly lower inference time. We trained our model using the same approach as the RoBERTa model and evaluated it on two text classification tasks: Sentiment Classification and Hate Speech Detection. We demonstrate that TurkishBERTweet outperforms the other available alternatives on generalizability, and its lower inference time gives it a significant advantage in processing large-scale datasets. We also compared our models with the commercial OpenAI solutions in terms of cost and performance to demonstrate that TurkishBERTweet is a scalable and cost-effective solution. As part of our research, we released TurkishBERTweet and fine-tuned LoRA adapters for the mentioned tasks under the MIT License to facilitate future research and applications on Turkish social media. Our TurkishBERTweet model is available at: https://github.com/ViralLab/TurkishBERTweet

Comment: 21 pages, 4 figures, 8 tables

arXiv.org e-Print Archive

Evolution of Online User Behavior During a Social Upheaval

Author: Ferrara Emilio
Flammini Alessandro
Menczer Filippo
Ogan Christine L.
Varol Onur
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014

Social media represent powerful tools of mass communication and information diffusion. They played a pivotal role during recent social uprisings and political mobilizations across the world. Here we present a study of the Gezi Park movement in Turkey through the lens of Twitter. We analyze over 2.3 million tweets produced during the 25 days of protest that occurred between May and June 2013. We first characterize the spatio-temporal nature of the conversation about the Gezi Park demonstrations, showing that similarity in trends of discussion mirrors geographic cues. We then describe the characteristics of the users involved in this conversation and what roles they played. We study how roles and individual influence evolved during the period of the upheaval. This analysis reveals that the conversation becomes more democratic as events unfold, with a redistribution of influence over time in the user population. We conclude by observing how the online and offline worlds are tightly intertwined, showing that exogenous events, such as political speeches or police actions, affect social media conversations and trigger changes in individual behavior.

Comment: Best Paper Award at ACM Web Science 201

arXiv.org e-Print Archive

Crossref

Online Human-Bot Interactions: Detection, Estimation, and Characterization

Author: Davis Clayton A.
Ferrara Emilio
Flammini Alessandro
Menczer Filippo
Varol Onur
Publication date: 27/03/2017

Increasing evidence suggests that a growing amount of social media content is generated by autonomous entities known as social bots. In this work we present a framework to detect such entities on Twitter. We leverage more than a thousand features extracted from public data and meta-data about users: friends, tweet content and sentiment, network patterns, and activity time series. We benchmark the classification framework by using a publicly available dataset of Twitter bots. This training data is enriched by a manually annotated collection of active Twitter users that include both humans and bots of varying sophistication. Our models yield high accuracy and agreement with each other and can detect bots of different nature. Our estimates suggest that between 9% and 15% of active Twitter accounts are bots. Characterizing ties among accounts, we observe that simple bots tend to interact with bots that exhibit more human-like behaviors. Analysis of content flows reveals retweet and mention strategies adopted by bots to interact with different target groups. Using clustering analysis, we characterize several subclasses of accounts, including spammers, self promoters, and accounts that post content from connected applications.

Comment: Accepted paper for ICWSM'17, 10 pages, 8 figures, 1 table

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Hidden Citations Obscure True Impact in Science

Author: Barabási Albert-László
Meng Xiangyi
Varol Onur
Publication date: 24/10/2023

References, the mechanism scientists rely on to signal previous knowledge, lately have turned into widely used and misused measures of scientific impact. Yet, when a discovery becomes common knowledge, citations suffer from obliteration by incorporation. This leads to the concept of hidden citation, representing a clear textual credit to a discovery without a reference to the publication embodying it. Here, we rely on unsupervised interpretable machine learning applied to the full text of each paper to systematically identify hidden citations. We find that for influential discoveries hidden citations outnumber citation counts, emerging regardless of publishing venue and discipline. We show that the prevalence of hidden citations is not driven by citation counts, but rather by the degree of the discourse on the topic within the text of the manuscripts, indicating that the more discussed is a discovery, the less visible it is to standard bibliometric analysis. Hidden citations indicate that bibliometric measures offer a limited perspective on quantifying the true impact of a discovery, raising the need to extract knowledge from the full text of the scientific corpus.

arXiv.org e-Print Archive