A Local-Global LDA Model for Discovering Geographical Topics from Social Media
Micro-blogging services can track users' geo-locations when users check in at
places or use geo-tagging, which implicitly reveals their locations. This "geo
tracking" can help to find topics triggered by events in certain regions.
However, discovering such topics is very challenging because of the large
amount of noisy messages (e.g. daily conversations). This paper proposes a
method to model geographical topics, which can filter out irrelevant words by
different weights in the local and global contexts. Our method is based on the
Latent Dirichlet Allocation (LDA) model but each word is generated from either
a local or a global topic distribution by its generation probabilities. We
evaluated our model with data collected from Weibo, which is currently the most
popular micro-blogging service for Chinese users. The evaluation results
demonstrate that our method outperforms baseline methods on several metrics,
including model perplexity, two kinds of entropy, and the KL-divergence of
discovered topics.
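The word-level choice between a local and a global topic distribution described in this abstract can be illustrated with a minimal generative sketch. It is not the authors' implementation: the function names, the single switch probability `p_local`, and the use of separate word distributions for local and global topics are all assumptions made for illustration.

```python
import random

def sample_categorical(probs, rng):
    """Draw an index from a discrete distribution given as a list of weights."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_message(n_words, p_local, local_theta, global_theta,
                     local_phi, global_phi, rng):
    """Generate word ids for one message (hypothetical parameterisation).

    p_local      -- assumed probability that a word comes from a local topic
    local_theta  -- topic proportions for the message's region
    global_theta -- topic proportions shared by all regions
    local_phi    -- per-topic word distributions for local topics
    global_phi   -- per-topic word distributions for global topics
    """
    words = []
    for _ in range(n_words):
        # Switch variable: decide whether this word is local or global.
        is_local = rng.random() < p_local
        theta, phi = (local_theta, local_phi) if is_local else (global_theta, global_phi)
        topic = sample_categorical(theta, rng)      # pick a topic
        word = sample_categorical(phi[topic], rng)  # pick a word from that topic
        words.append((word, "local" if is_local else "global"))
    return words

if __name__ == "__main__":
    rng = random.Random(0)
    print(generate_message(
        n_words=8, p_local=0.3,
        local_theta=[0.7, 0.3], global_theta=[0.5, 0.5],
        local_phi=[[0.9, 0.1, 0.0], [0.1, 0.8, 0.1]],
        global_phi=[[0.2, 0.2, 0.6], [0.3, 0.3, 0.4]],
        rng=rng,
    ))
```

In a sketch like this, words labelled "global" play the role of background conversation, while words labelled "local" carry the region-specific signal.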
A Location-Sentiment-Aware Recommender System for Both Home-Town and Out-of-Town Users
Spatial item recommendation has become an important means to help people
discover interesting locations, especially when people pay a visit to
unfamiliar regions. Some current research focuses on modelling
individual and collective geographical preferences for spatial item
recommendation based on users' check-in records, but they fail to explore the
phenomenon of user interest drift across geographical regions, i.e., users
would show different interests when they travel to different regions. Besides,
they ignore the influence of public comments for subsequent users' check-in
behaviors. Specifically, it is intuitive that users would refuse to check in to
a spatial item whose historical reviews seem negative overall, even though it
might fit their interests. Therefore, it is necessary to recommend the right
item to the right user at the right location. In this paper, we propose a
latent probabilistic generative model called LSARS to mimic the decision-making
process of users' check-in activities both in home-town and out-of-town
scenarios by adapting to user interest drift and crowd sentiments, which can
learn location-aware and sentiment-aware individual interests from the contents
of spatial items and user reviews. Due to the sparsity of user activities in
out-of-town regions, LSARS is further designed to incorporate the public
preferences learned from local users' check-in behaviors. Finally, we deploy
LSARS in two practical application scenarios: spatial item recommendation and
target user discovery. Extensive experiments on two large-scale location-based
social networks (LBSNs) datasets show that LSARS achieves better performance
than existing state-of-the-art methods.
Comment: Accepted by KDD 201
Hierarchical relational models for document networks
We develop the relational topic model (RTM), a hierarchical model of both
network structure and node attributes. We focus on document networks, where the
attributes of each document are its words, that is, discrete observations taken
from a fixed vocabulary. For each pair of documents, the RTM models their link
as a binary random variable that is conditioned on their contents. The model
can be used to summarize a network of documents, predict links between them,
and predict words within them. We derive efficient inference and estimation
algorithms based on variational methods that take advantage of sparsity and
scale with the number of links. We evaluate the predictive performance of the
RTM for large networks of scientific abstracts, web documents, and
geographically tagged news.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS309 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
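To make the idea of a link modelled as a binary random variable conditioned on document contents more concrete, here is a minimal sketch. The abstract does not give the functional form, so the logistic link over the element-wise product of two documents' topic proportions, and the names `eta` and `nu`, are assumptions for illustration only.

```python
import math

def link_probability(topics_a, topics_b, eta, nu):
    """Probability of a link between two documents (assumed logistic form).

    topics_a, topics_b -- topic proportions of the two documents
    eta                -- hypothetical per-topic weights
    nu                 -- hypothetical intercept
    """
    # Weighted overlap of the two topic-proportion vectors.
    score = sum(e * a * b for e, a, b in zip(eta, topics_a, topics_b)) + nu
    # Squash the score into (0, 1) with the logistic function.
    return 1.0 / (1.0 + math.exp(-score))

if __name__ == "__main__":
    eta = [4.0, 4.0]
    nu = -1.0
    similar = link_probability([0.9, 0.1], [0.8, 0.2], eta, nu)
    dissimilar = link_probability([0.9, 0.1], [0.1, 0.9], eta, nu)
    print(similar, dissimilar)
```

With positive weights, documents that concentrate on the same topics receive a higher link probability than documents that concentrate on different ones.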
Event detection in location-based social networks
With the advent of social networks and the rise of mobile technologies, users have become ubiquitous sensors capable of monitoring various real-world events in a crowd-sourced manner. Location-based social networks have proven to be faster than traditional media channels in reporting and geo-locating breaking news; e.g., Osama Bin Laden’s death was first confirmed on Twitter even before the announcement from the communication department at the White House. However, the deluge of user-generated data on these networks requires intelligent systems capable of identifying and characterizing such events in a comprehensive manner. The data mining community coined the term event detection to refer to the task of uncovering emerging patterns in data streams. Nonetheless, most data mining techniques do not reproduce the underlying data generation process, which hampers their ability to self-adapt in fast-changing scenarios. Because of this, we propose a probabilistic machine learning approach to event detection which explicitly models the data generation process and enables reasoning about the discovered events. To set out the differences between the two approaches, we present two techniques for the problem of event detection in Twitter: a data mining technique called Tweet-SCAN and a machine learning technique called Warble. We assess and compare both techniques on a dataset of tweets geo-located in the city of Barcelona during its annual festivities.
Last but not least, we present the algorithmic changes and data processing frameworks to scale up the proposed techniques to big data workloads.
This work is partially supported by Obra Social “la Caixa”, by the Spanish Ministry of Science and Innovation under contract (TIN2015-65316), by the Severo Ochoa Program (SEV2015-0493), by SGR programs of the Catalan Government (2014-SGR-1051, 2014-SGR-118), Collectiveware (TIN2015-66863-C2-1-R) and BSC/UPC NVIDIA GPU Center of Excellence. We would also like to thank the reviewers for their constructive feedback.
Peer Reviewed
Postprint (author's final draft