A Local-Global LDA Model for Discovering Geographical Topics from Social Media
Micro-blogging services can track users' geo-locations when users check in at
places or use geo-tagging, which implicitly reveals their locations. This "geo
tracking" can help to find topics triggered by events in certain regions.
However, discovering such topics is very challenging because of the large
number of noisy messages (e.g. daily conversations). This paper proposes a
method to model geographical topics that filters out irrelevant words by
weighting them differently in the local and global contexts. Our method is
based on the Latent Dirichlet Allocation (LDA) model, but each word is
generated from either a local or a global topic distribution according to its
generation probabilities. We evaluated our model with data collected from
Weibo, which is currently the most popular micro-blogging service for Chinese.
The evaluation results demonstrate that our method outperforms the baseline
methods on several metrics, such as model perplexity, two kinds of entropy, and
the KL-divergence of discovered topics.
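The local-or-global word generation described in this abstract can be illustrated with a small generative sketch. This is only a toy under stated assumptions, not the paper's actual model: the switch probability `p_local`, the function names, and the tiny vocabulary are all hypothetical, and no inference or weighting scheme is shown.

```python
import random


def sample_categorical(probs, rng):
    """Draw an index from a discrete distribution given as a list of weights."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(n_words, p_local, theta_local, theta_global,
                      local_topics, global_topics, vocab, rng):
    """Generate words, each drawn from either a local or a global topic.

    p_local is an assumed per-word probability of choosing the local context;
    theta_* are topic mixtures and *_topics are per-topic word distributions.
    """
    words = []
    for _ in range(n_words):
        use_local = rng.random() < p_local          # local/global switch
        if use_local:
            theta, topics = theta_local, local_topics
        else:
            theta, topics = theta_global, global_topics
        z = sample_categorical(theta, rng)          # pick a topic
        w = sample_categorical(topics[z], rng)      # pick a word from it
        words.append((vocab[w], "local" if use_local else "global"))
    return words


# Hypothetical toy data: one event-like local topic, one chatter-like global topic.
vocab = ["earthquake", "flood", "lunch", "weekend"]
local_topics = [[0.7, 0.3, 0.0, 0.0]]
global_topics = [[0.0, 0.0, 0.5, 0.5]]
doc = generate_document(8, 0.4, [1.0], [1.0],
                        local_topics, global_topics, vocab, random.Random(0))
```

In this toy setting, words tagged "global" play the role of everyday noise, while words tagged "local" stand in for region-specific event vocabulary.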
Geo-Information Harvesting from Social Media Data
As unconventional sources of geo-information, massive imagery and text
messages from open platforms and social media form a temporally quasi-seamless,
spatially multi-perspective stream, but with unknown and diverse quality. Due
to its complementarity to remote sensing data, geo-information from these
sources offers promising perspectives, but harvesting is not trivial due to its
data characteristics. In this article, we address key aspects in the field,
including data availability, analysis-ready data preparation and data
management, geo-information extraction from social media text messages and
images, and the fusion of social media and remote sensing data. We then
showcase some exemplary geographic applications. In addition, we present the
first extensive discussion of ethical considerations of social media data in
the context of geo-information harvesting and geographic applications. With
this effort, we wish to stimulate curiosity and lay the groundwork for
researchers who intend to explore social media data for geo-applications. We
encourage the community to join forces by sharing their code and data.
Comment: Accepted for publication in IEEE Geoscience and Remote Sensing Magazine
A Roadmap for Citizen Science in GEO - The essence of the Lisbon Declaration. WeObserve policy brief 1
The relevance of Citizen Science and Citizen Observatories has only recently been considered in GEO activities. In order to advocate its importance and significance, this policy brief summarises three key messages from the Lisbon Declaration for European policy makers and describes how best to connect and integrate Citizen Science communities, as well as their activities and outputs, into GEO.
Using Robust PCA to estimate regional characteristics of language use from geo-tagged Twitter messages
Principal component analysis (PCA) and related techniques have been
successfully employed in natural language processing. Text mining applications
in the age of the online social media (OSM) face new challenges due to
properties specific to these use cases (e.g. spelling issues specific to texts
posted by users, the presence of spammers and bots, service announcements,
etc.). In this paper, we employ a Robust PCA technique to separate typical
outliers and highly localized topics from the low-dimensional structure present
in language use in online social networks. Our focus is on identifying
geospatial features among the messages posted by the users of the Twitter
microblogging service. Using a dataset which consists of over 200 million
geolocated tweets collected over the course of a year, we investigate whether
the information present in word usage frequencies can be used to identify
regional features of language use and topics of interest. Using the PCA pursuit
method, we are able to identify important low-dimensional features, which
constitute smoothly varying functions of the geographic location.
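For reference, the convex program usually meant by principal component pursuit is written out below. This is the standard formulation from the Robust PCA literature and not necessarily the exact variant used in the paper; the symbols are notation assumed here, with $M$ standing for a data matrix (for example, word frequencies per location), $L$ its low-rank part, and $S$ its sparse part.

```latex
\min_{L,\,S}\; \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{subject to} \quad L + S = M
```

Here $\|L\|_{*}$ is the nuclear norm (sum of singular values) and $\|S\|_{1}$ is the entrywise $\ell_1$ norm. In the abstract's terms, $L$ would correspond to the smoothly varying low-dimensional structure and $S$ to outliers and highly localized topics.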
User-driven geo-temporal density-based exploration of periodic and not periodic events reported in social networks
In this paper we propose a procedure consisting of a first collection phase of social network messages, a subsequent user query selection, and finally a clustering phase, defined by extending the density-based DBSCAN algorithm, for performing a geographic and temporal exploration of a collection of items, in order to reveal and map their latent spatio-temporal structure. Specifically, several geo-temporal distance measures and a density-based geo-temporal clustering algorithm are proposed. The approach can be applied to social messages containing an explicit geographic and temporal location. Use of the algorithm is exemplified by identifying geographic regions where many geotagged Twitter messages about an event of interest have been created, possibly in the same time period in the case of non-periodic (aperiodic) events, or at regular timestamps in the case of periodic events. This allows discovering the spatio-temporal periodic and aperiodic characteristics of events occurring in specific geographic areas, thus increasing the awareness of decision makers who are in charge of territorial planning. Several case studies are used to illustrate the proposed procedure.
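The idea of extending DBSCAN with a geo-temporal notion of neighbourhood can be sketched as follows. This is a minimal illustration under assumptions, not the authors' algorithm: two messages are treated as neighbours when they fall within `eps_km` of each other (haversine distance) and within `eps_hours` in time, which is just one of many possible geo-temporal distance measures. All names and thresholds are hypothetical, and periodic events are not handled.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def neighbors(points, i, eps_km, eps_hours):
    """Indices of points close to points[i] in both space and time."""
    lat, lon, t = points[i]
    return [j for j, (la, lo, tj) in enumerate(points)
            if haversine_km(lat, lon, la, lo) <= eps_km
            and abs(t - tj) <= eps_hours]


def geo_temporal_dbscan(points, eps_km, eps_hours, min_pts):
    """Label each (lat, lon, hours) point with a cluster id, or -1 for noise."""
    noise = -1
    labels = [None] * len(points)   # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(points, i, eps_km, eps_hours)
        if len(seeds) < min_pts:
            labels[i] = noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nj = neighbors(points, j, eps_km, eps_hours)
            if len(nj) >= min_pts:   # core point: keep expanding the cluster
                queue.extend(nj)
    return labels


# Hypothetical messages as (latitude, longitude, hours since start).
messages = [
    (38.7223, -9.1393, 0.0), (38.7230, -9.1400, 0.5), (38.7215, -9.1380, 1.0),
    (38.7223, -9.1393, 100.0),   # same place, but days later
    (45.4642, 9.1900, 0.2), (45.4650, 9.1910, 0.4), (45.4635, 9.1890, 0.9),
]
labels = geo_temporal_dbscan(messages, eps_km=1.0, eps_hours=2.0, min_pts=3)
```

Requiring both thresholds to hold keeps a message posted at the same place days later out of the cluster, which is the behaviour one would want for aperiodic events.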