#greysanatomy vs. #yankees: Demographics and Hashtag Use on Twitter
Demographics, in particular gender, age, and race, are key predictors of
human behavior. Despite the significant effect that demographics have, most
scientific studies using online social media do not consider this factor,
mainly due to the lack of such information. In this work, we use
state-of-the-art face analysis software to infer gender, age, and race from
profile images of 350K Twitter users from New York. For the period from
November 1, 2014 to October 31, 2015, we study which hashtags are used by
different demographic groups. Though we find considerable overlap for the most
popular hashtags, there are also many group-specific hashtags.
Comment: This is a preprint of an article appearing at ICWSM 201
Language in Our Time: An Empirical Analysis of Hashtags
Hashtags in online social networks have gained tremendous popularity during
the past five years. The resulting large quantity of data has provided a new
lens into modern society. Previously, researchers mainly relied on data collected
from Twitter to study either a certain type of hashtag or a certain property
of hashtags. In this paper, we perform the first large-scale empirical analysis
of hashtags shared on Instagram, the major platform for hashtag-sharing. We
study hashtags from three different dimensions including the temporal-spatial
dimension, the semantic dimension, and the social dimension. Extensive
experiments performed on three large-scale datasets with more than 7 million
hashtags in total provide a series of interesting observations. First, we show
that the temporal patterns of hashtags can be categorized into four different
clusters, and people tend to share fewer hashtags at certain places and more
hashtags at others. Second, we observe that a non-negligible proportion of
hashtags exhibit large semantic displacement. We demonstrate hashtags that are
more uniformly shared among users, as quantified by the proposed hashtag
entropy, are less prone to semantic displacement. In the end, we propose a
bipartite graph embedding model to summarize users' hashtag profiles, and rely
on these profiles to perform friendship prediction. Evaluation results show
that our approach achieves effective prediction, with an AUC (area under the ROC
curve) above 0.8, which demonstrates the strong social signals carried by
hashtags.
Comment: WWW 201
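The abstract does not give the formula for hashtag entropy, so the following is only a minimal sketch under an assumption: that it is the Shannon entropy of the distribution of one hashtag's uses over the users who shared it. The function name `hashtag_entropy` and the toy user IDs are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def hashtag_entropy(user_ids):
    """Shannon entropy (in bits) of how one hashtag's uses are spread over users.

    user_ids: iterable with one entry per use of the hashtag, naming the user.
    ASSUMPTION: the paper's exact definition may differ from this one.
    """
    counts = Counter(user_ids)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no uses at all: treat entropy as zero
    entropy = 0.0
    for count in counts.values():
        p = count / total  # share of this hashtag's uses made by one user
        entropy -= p * math.log2(p)
    return entropy


# Toy data: four uses spread evenly over four users vs. dominated by one user.
uniform = hashtag_entropy(["a", "b", "c", "d"])
skewed = hashtag_entropy(["a", "a", "a", "b"])
print(uniform, skewed)
```

Under this assumed definition, a hashtag shared evenly by many users scores higher than one dominated by a single user, which matches the abstract's description of "more uniformly shared" hashtags.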
A Link Prediction Strategy for Personalized Tweet Recommendation through Doc2Vec Approach
With the growth of the Internet as a principal means of communication, through social media channels (Twitter, Facebook, etc.) and access to huge amounts of information such as news, helping users find what interests them among vast amounts of relevant and irrelevant information has become a major research subject. Recommender systems help to handle the information overload problem, and in this paper we introduce our Tweet Recommendation System, which uses a user's Twitter activity (tweets, retweets, likes, ...) as the source of information about that user. In this work, the semantics of tweets regarded as the user's explicit interests (e.g., persons, events, and products mentioned in the user's tweets) are identified with the Doc2Vec approach, and similar tweets are recommended through a link-prediction strategy. The experimental results show that the Doc2Vec approach performs better than the previous approaches.
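The abstract does not describe the recommendation step in detail, so the sketch below is not the authors' link-prediction method. It only illustrates the general idea of recommending semantically similar tweets, assuming each tweet and the user profile have already been mapped to fixed-length vectors by some document-embedding model such as Doc2Vec. The vectors, tweet IDs, and the `recommend` function are made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # similarity is undefined for a zero vector; use 0 here
    return dot / (norm_u * norm_v)


def recommend(user_vec, candidates, k=2):
    """Return the IDs of the k candidate tweets most similar to the user vector."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine(user_vec, item[1]),
        reverse=True,
    )
    return [tweet_id for tweet_id, _ in ranked[:k]]


# Hypothetical embeddings; a real system would infer them from tweet text.
user_profile = [0.9, 0.1, 0.0]
candidate_tweets = {
    "t1": [1.0, 0.0, 0.0],
    "t2": [0.0, 1.0, 0.0],
    "t3": [0.7, 0.3, 0.0],
}
top = recommend(user_profile, candidate_tweets, k=2)
print(top)
```

A link-prediction formulation would instead score potential user-to-tweet edges in a graph; similarity between embeddings is one plausible input to such a score.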
Transnational hashtag protest movements and emancipatory politics in Africa: A three country study
This study explores three of sub-Saharan Africa’s hashtag movements: Zimbabwe’s #ZimbabweanLivesMatter, Eswatini’s #EswatiniLivesMatter and Nigeria’s #EndSARS hashtags. Theoretically, we rely on the transnational alternative digital public sphere and hashtag activism to understand how social media acted as a meeting place for mobilization and building cross-boundary pollination and unitary movements. This investigation relied on a combination of virtual ethnography and purposive sampling as methodological approaches. Thematic analysis was the analytical tool employed, with four themes informing this investigation: democratisation and human rights, transnational solidarity, states’ response to hashtag movements and use of parody accounts as a counter-hegemonic strategy. The study found that these hashtags and movements achieved a modicum of ‘success’ by forcing some of Africa’s enduring dictatorships to make piecemeal concessions of varying degrees.
Popularity and Geospatial Spread of Trends on Twitter: A Middle Eastern Case Study
Thousands of topics trend on Twitter across the world every day, making it increasingly challenging to provide real-time analysis of current issues, topics and themes being discussed across various locations and jurisdictions. There is thus a demand for simple and extensible approaches to provide deeper insight into these trends and how they propagate across locales. This paper represents one of the first studies to look at the geospatial spread of trends on Twitter, presenting various techniques to provide increased understanding of how trends on social networks can spread across various regions and nations. It is based on a year-long data collection (N=2,307,163) and analysis between 2016 and 2017 of seven Middle Eastern countries (Bahrain, Egypt, Kuwait, Lebanon, Qatar, Saudi Arabia, and the United Arab Emirates). Using this year-long dataset, the project investigates the popularity and geospatial spread of trends, focusing on trend information but not processing individual topics, with the findings showing that the likelihood of a trend spreading to other locales is to a large extent influenced by the place in which it first appeared.
Government management of the COVID-19 communication and public perception of the pandemic
The study presented here discusses public reception of the UK-wide government restrictions and regulations in relation to the COVID-19 pandemic, focusing on language use on Twitter to (1) track the prevalence of diverse opinions and changes in public perceptions and (2) reflect on clarity of official messaging. Our report relates to the four themes outlined as part of the Initial learning from the government’s response to the COVID-19 pandemic collated by the National Audit Office:
- transparency and public trust: providing transparent public-facing advice through clear and timely communication.
- data and evidence: monitoring public perception of government advice, identifying issues with public compliance and quantifying different types of behaviours/reactions (compliance, non-compliance, call for stricter measures), validating the effectiveness of interventions by systematically gathering and evaluating end-user feedback (comments from the public).
- coordination and delivery models: ensuring that public facing communication from government departments, central and local government, and public sector bodies is effectively coordinated and well-aligned.
- supporting and protecting people: understanding the pandemic’s impact on different groups and the risk of widening inequalities.
The report is based on the results of the UKRI/AHRC-funded TRAC:COVID project carried out at Birmingham City University. The first section draws on the dashboard created as part of the project, accessible online at https://traccovid.com. The dashboard is an open access tool based on 84,138,394 tweets related to coronavirus posted by users in the UK between 1st January 2020 and 30th April 2021. The tool helps explore how social media have been used in the UK during the pandemic to talk about COVID-19. Our analysis shows that throughout the pandemic there has been widespread support for the main measures used to contain the COVID-19 virus outbreak. In fact, a considerable number of tweets supported the introduction of even stronger measures than those imposed by the government, and many criticised non-compliance as a sign of selfish behaviour. The results also indicate the presence of users who actively used terms related to conspiracy theories and, although these views were found to be in the minority, it is important not to underestimate the role they play in undermining the efforts to contain the pandemic.
The second part of the report reflects on the comprehensibility of official messages sent from government accounts and the accounts of public health bodies. The analysis shows a wide range of language-related problems, ranging from complex use of vocabulary and grammar and vague references to inaccurate information and potential exclusion of some of the intended recipients
Raumgeographische Verteilung von Twitter-Hashtags im deutschen Sprachraum
This study examines the spatial distribution of hashtags in a corpus of German-language tweets by considering three kinds of user location information: exact location encoded as latitude-longitude coordinates, a "place" attribute selected from a Twitter-maintained list of places, or a free-form entry in the user profile. Hashtags in tweets with exact locations are slightly more likely to show spatial concentration, compared to hashtags with place or user location information, which may reflect the use of mobile devices to publish tweets. While spatial autocorrelation analysis shows that most hashtags do not exhibit a strong spatial tendency, those that do are likely to be toponyms, appellatives, or proper nouns associated with specific places, as can be shown by mapping autocorrelation values. In addition, some hashtags that exhibit a spatial tendency describe localized geographical or meteorological phenomena.
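The abstract does not say which spatial autocorrelation statistic was used. As a minimal sketch, assuming the widely used global Moran's I, the example below computes it for per-region hashtag counts. The regions, counts, and adjacency weights are invented for illustration, and the function name `morans_i` is hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for one value per region.

    values:  list of n observations (e.g. hashtag counts per region).
    weights: n x n spatial weight matrix; weights[i][j] > 0 if i and j are neighbours.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    w_sum = sum(sum(row) for row in weights)
    if denom == 0 or w_sum == 0:
        return 0.0  # statistic is undefined here; 0.0 is a convention of this sketch
    num = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / w_sum) * (num / denom)


# Four hypothetical regions in a row; each region neighbours the next one.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([10, 9, 1, 0], adjacency)     # similar counts sit side by side
alternating = morans_i([10, 0, 10, 0], adjacency)  # neighbours differ sharply
print(clustered, alternating)
```

Positive values indicate that neighbouring regions have similar counts (spatial concentration), while negative values indicate that neighbours tend to differ.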
SWAT: A System for Detecting Salient Wikipedia Entities in Texts
We study the problem of entity salience by proposing the design and
implementation of SWAT, a system that identifies the salient Wikipedia entities
occurring in an input document. SWAT consists of several modules that are able
to detect and classify Wikipedia entities on the fly as salient or not, based
on a large number of syntactic, semantic and latent features properly extracted
via a supervised process which has been trained over millions of examples drawn
from the New York Times corpus. The validation process is performed through a
large experimental assessment, eventually showing that SWAT improves known
solutions over all publicly available datasets. We release SWAT via an API that
we describe and comment on in the paper in order to ease its use in other
software.