10,569 research outputs found
A Survey of Location Prediction on Twitter
Locations, e.g., countries, states, cities, and points of interest, are
central to news, emergency events, and people's daily lives. Automatic
identification of locations associated with or mentioned in documents has been
explored for decades. As one of the most popular online social network
platforms, Twitter has attracted a large number of users who send millions of
tweets on a daily basis. Due to the world-wide coverage of its users and
real-time freshness of tweets, location prediction on Twitter has gained
significant attention in recent years. Research efforts are spent on dealing
with new challenges and opportunities brought by the noisy, short, and
context-rich nature of tweets. In this survey, we aim at offering an overall
picture of location prediction on Twitter. Specifically, we concentrate on the
prediction of user home locations, tweet locations, and mentioned locations. We
first define the three tasks and review the evaluation metrics. By summarizing
Twitter network, tweet content, and tweet context as potential inputs, we then
structurally highlight how the problems depend on these inputs. Each dependency
is illustrated by a comprehensive review of the corresponding strategies
adopted in state-of-the-art approaches. In addition, we briefly review two
related problems, i.e., semantic location prediction and point-of-interest
recommendation. Finally, we list future research directions.
Comment: Accepted to TKDE. 30 pages, 1 figure
On the Accuracy of Hyper-local Geotagging of Social Media Content
Social media users share billions of items per year, only a small fraction of
which is geotagged. We present a data-driven approach for identifying
non-geotagged content items that can be associated with a hyper-local
geographic area by modeling the location distributions of hyper-local n-grams
that appear in the text. We explore the trade-off between accuracy, precision
and coverage of this method. Further, we explore differences across content
received from multiple platforms and devices, and show, for example, that
content shared via different sources and applications produces significantly
different geographic distributions, and that it is best to model and predict
location for items according to their source. Our findings show the potential
and the bounds of a data-driven approach to geotag short social media texts,
and offer implications for all applications that use data-driven approaches to
locate content.
Comment: 10 pages
Accurate Local Estimation of Geo-Coordinates for Social Media Posts
Associating geo-coordinates with the content of social media posts can
enhance many existing applications and services and enable a host of new ones.
Unfortunately, a majority of social media posts are not tagged with
geo-coordinates. Even when location data is available, it may be inaccurate,
very broad or sometimes fictitious. Contemporary location estimation approaches
based on analyzing the content of these posts can identify only broad areas
such as a city, which limits their usefulness. To address these shortcomings,
this paper proposes a methodology to narrowly estimate the geo-coordinates of
social media posts with high accuracy. The methodology relies solely on the
content of these posts and prior knowledge of the wide geographical region from
where the posts originate. An ensemble of language models, which are smoothed
over non-overlapping sub-regions of a wider region, lies at the heart of the
methodology. Experimental evaluation using a corpus of over half a million
tweets from New York City shows that the approach, on average, estimates
locations of tweets to within just 2.15 km of their actual positions.
Comment: In Proceedings of the 26th International Conference on Software
Engineering and Knowledge Engineering, pp. 642 - 647, 201
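Note on the reported error: a figure such as "within 2.15 km on average" is a mean distance between estimated and actual coordinates. The abstract does not say which distance formula was used, so the following is only a minimal sketch that assumes great-circle (haversine) distance on a spherical Earth. The function names `haversine_km` and `mean_error_km` and the radius constant are illustrative choices, not taken from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def mean_error_km(pairs):
    """Mean distance over (predicted, actual) pairs, each a (lat, lon) tuple."""
    distances = [haversine_km(p[0], p[1], a[0], a[1]) for p, a in pairs]
    return sum(distances) / len(distances)
```

Calling `mean_error_km` on a list of (predicted, actual) coordinate pairs returns the average error in kilometres. A median is often reported alongside the mean, since a few very distant mispredictions can dominate the average.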
Determine the User Country of a Tweet
On the widely used messaging platform Twitter, about 2% of tweets include
their geographical location as exact GPS coordinates (latitude and
longitude). Knowing the location of a tweet is useful for many data analytics
questions. This research examines how to determine a location for
tweets that do not contain GPS coordinates. An accuracy of 82% was achieved
using a Naive Bayes model trained on features such as the user's timezone, the
user's language, and the parsed user location. The classifier performs well on
active Twitter countries such as the Netherlands and United Kingdom. An
analysis of errors made by the classifier shows that mistakes were made due to
limited information and shared properties between countries such as shared
timezone. A feature analysis was performed in order to see the effect of
different features. The features timezone and parsed user location were the
most informative features.
Comment: CTIT Technical Report, University of Twente
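Note on the method: the abstract names a Naive Bayes model over categorical features (timezone, language, parsed user location) but gives no implementation details. The following is a minimal sketch of a categorical Naive Bayes classifier with add-one smoothing, written under that assumption. The class name `CategoricalNaiveBayes`, the smoothing scheme, and the four toy training rows are all invented for illustration and are not the report's code or data.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes over tuples of categorical features, with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                           # smoothing strength (assumed value)
        self.class_counts = Counter()                # label -> number of training rows
        self.feature_counts = defaultdict(Counter)   # (label, feature index) -> value counts
        self.feature_values = defaultdict(set)       # feature index -> values seen in training

    def fit(self, rows, labels):
        for row, label in zip(rows, labels):
            self.class_counts[label] += 1
            for i, value in enumerate(row):
                self.feature_counts[(label, i)][value] += 1
                self.feature_values[i].add(value)
        return self

    def predict(self, row):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)          # log prior
            for i, value in enumerate(row):
                seen = self.feature_counts[(label, i)][value]
                vocab = len(self.feature_values[i]) + 1  # one extra slot for unseen values
                score += math.log((seen + self.alpha) / (count + self.alpha * vocab))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy rows: (timezone, language, parsed user location) -> country code.
rows = [
    ("Amsterdam", "nl", "amsterdam"),
    ("Amsterdam", "nl", "utrecht"),
    ("London", "en", "london"),
    ("London", "en", "manchester"),
]
labels = ["NL", "NL", "GB", "GB"]
model = CategoricalNaiveBayes().fit(rows, labels)
print(model.predict(("Amsterdam", "nl", "rotterdam")))
```

Because each feature contributes an independent likelihood term, two countries that share a timezone receive the same score from that feature, which is consistent with the error pattern the abstract describes.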
The ISIS Twitter census: defining and describing the population of ISIS supporters on Twitter
Presents a demographic snapshot of ISIS supporters on Twitter by analysing a sample of 20,000 ISIS-supporting Twitter accounts, mapping the locations, preferred languages, and the number and type of followers of these accounts.
Overview
Although much ink has been spilled on ISIS’s activity on Twitter, very basic questions about the group’s social media strategy remain unanswered. In a new analysis paper, J.M. Berger and Jonathon Morgan answer fundamental questions about how many Twitter users support ISIS, who and where they are, and how they participate in its highly organized online activities.
Previous analyses of ISIS’s Twitter reach have relied on limited segments of the overall ISIS social network. The small, cellular nature of that network—and the focus on particular subsets within the network such as foreign fighters—may create misleading conclusions. This information vacuum extends to discussions of how the West should respond to the group’s online campaigns.
Berger and Morgan present a demographic snapshot of ISIS supporters on Twitter by analyzing a sample of 20,000 ISIS-supporting Twitter accounts. Using a sophisticated and innovative methodology, the authors map the locations, preferred languages, and the number and type of followers of these accounts.
Among the key findings:
From September through December 2014, the authors estimate that at least 46,000 Twitter accounts were used by ISIS supporters, although not all of them were active at the same time.
Typical ISIS supporters were located within the organization’s territories in Syria and Iraq, as well as in regions contested by ISIS. Hundreds of ISIS-supporting accounts sent tweets with location metadata embedded.
Almost one in five ISIS supporters selected English as their primary language when using Twitter. Three quarters selected Arabic.
ISIS-supporting accounts had an average of about 1,000 followers each, considerably more than an ordinary Twitter user has. ISIS-supporting accounts were also considerably more active than non-supporting users.
A minimum of 1,000 ISIS-supporting accounts were suspended by Twitter between September and December 2014. Accounts that tweeted most often and had the most followers were most likely to be suspended.
Much of ISIS’s social media success can be attributed to a relatively small group of hyperactive users, numbering between 500 and 2,000 accounts, which tweet in concentrated bursts of high volume.
Based on their key findings, the authors recommend that social media companies and the U.S. government work together to devise appropriate responses to extremism on social media. Approaches to the problem of extremist use of social media, Berger and Morgan contend, are most likely to succeed when they are mainstreamed into wider dialogues among the broad range of community, private, and public stakeholders.
- …