338 research outputs found
A Survey of Location Prediction on Twitter
Locations, e.g., countries, states, cities, and points of interest, are
central to news, emergency events, and people's daily lives. Automatic
identification of locations associated with or mentioned in documents has been
explored for decades. As one of the most popular online social network
platforms, Twitter has attracted a large number of users who send millions of
tweets on a daily basis. Due to the worldwide coverage of its users and
real-time freshness of tweets, location prediction on Twitter has gained
significant attention in recent years. Research efforts are spent on dealing
with new challenges and opportunities brought by the noisy, short, and
context-rich nature of tweets. In this survey, we aim at offering an overall
picture of location prediction on Twitter. Specifically, we concentrate on the
prediction of user home locations, tweet locations, and mentioned locations. We
first define the three tasks and review the evaluation metrics. By summarizing
Twitter network, tweet content, and tweet context as potential inputs, we then
structurally highlight how the problems depend on these inputs. Each dependency
is illustrated by a comprehensive review of the corresponding strategies
adopted in state-of-the-art approaches. In addition, we also briefly review two
related problems, i.e., semantic location prediction and point-of-interest
recommendation. Finally, we list future research directions.
Comment: Accepted to TKDE. 30 pages, 1 figure
Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks
We propose a method for embedding two-dimensional locations in a continuous
vector space using a neural network-based model incorporating mixtures of
Gaussian distributions, presenting two model variants for text-based
geolocation and lexical dialectology. Evaluated over Twitter data, the proposed
model outperforms conventional regression-based geolocation and provides a
better estimate of uncertainty. We also show the effectiveness of the
representation for predicting words from location in lexical dialectology, and
evaluate it using the DARE dataset.
Comment: Conference on Empirical Methods in Natural Language Processing (EMNLP 2017), September 2017, Copenhagen, Denmark
City-level Geolocation of Tweets for Real-time Visual Analytics
Real-time tweets can provide useful information on evolving events and
situations. Geotagged tweets are especially useful, as they indicate the
location of origin and provide geographic context. However, only a small
portion of tweets are geotagged, limiting their use for situational awareness.
In this paper, we adapt, improve, and evaluate a state-of-the-art deep learning
model for city-level geolocation prediction, and integrate it with a visual
analytics system tailored for real-time situational awareness. We provide
computational evaluations to demonstrate the superiority and utility of our
geolocation prediction model within an interactive system.
Comment: 4 pages, 2 tables, 1 figure, SIGSPATIAL GeoAI Workshop
A Transformer-based Framework for POI-level Social Post Geolocation
POI-level geo-information of social posts is critical to many location-based
applications and services. However, the multi-modality, complexity and diverse
nature of social media data and their platforms limit the performance of
inferring such fine-grained locations and their subsequent applications. To
address this issue, we present a transformer-based general framework, which
builds upon pre-trained language models and considers non-textual data, for
social post geolocation at the POI level. To this end, inputs are categorized
to handle different social data, and an optimal combination strategy is
provided for feature representations. Moreover, a uniform representation of
hierarchy is proposed to learn temporal information, and a concatenated version
of encodings is employed to capture feature-wise positions better. Experimental
results on various social datasets demonstrate that three variants of our
proposed framework outperform multiple state-of-the-art baselines by a large margin
in terms of accuracy and distance error metrics.
Comment: Full papers are 12 pages in length plus an additional 4 pages for references (18 pages in total after submission to arXiv). One figure and 5 tables are included. This paper was submitted to ECIR 2023 for review
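Several of the abstracts above report results in terms of accuracy and distance error. As a rough illustration of what such metrics can look like, the following minimal sketch computes mean and median great-circle error plus the fraction of predictions within a distance threshold. It is not taken from any of the listed papers: the function names `haversine_km` and `distance_error_report`, the `threshold_km` parameter, and its default value are all assumptions made for this example.

```python
import math
from statistics import mean, median

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def distance_error_report(predicted, actual, threshold_km=1.0):
    """Summarize prediction error for paired lists of (lat, lon) points.

    threshold_km is a hypothetical cut-off: a prediction counts as
    "accurate" when it lies within this distance of the true location.
    """
    errors = [haversine_km(p, t) for p, t in zip(predicted, actual)]
    return {
        "mean_km": mean(errors),
        "median_km": median(errors),
        "acc_within": sum(e <= threshold_km for e in errors) / len(errors),
    }


if __name__ == "__main__":
    # Made-up coordinates, purely for demonstration.
    truth = [(55.6761, 12.5683), (40.7128, -74.0060)]
    preds = [(55.6761, 12.5683), (40.7306, -73.9352)]
    print(distance_error_report(preds, truth, threshold_km=1.0))
```

The threshold would be chosen to match the task granularity: a small value for POI-level prediction, a much larger one for city-level prediction.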