Latent Self-Exciting Point Process Model for Spatial-Temporal Networks
We propose a latent self-exciting point process model that describes
geographically distributed interactions between pairs of entities. In contrast
to most existing approaches that assume fully observable interactions, here we
consider a scenario where certain interaction events lack information about
participants. Instead, this information needs to be inferred from the available
observations. We develop an efficient approximate algorithm based on
variational expectation-maximization to infer unknown participants in an event
given the location and the time of the event. We validate the model on
synthetic as well as real-world data, and obtain very promising results on the
identity-inference task. We also use our model to predict the timing and
participants of future events, and demonstrate that it compares favorably with
baseline approaches.
Comment: 20 pages, 6 figures (v3); 11 pages, 6 figures (v2); previous version
appeared in the 9th Bayesian Modeling Applications Workshop, UAI'1
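As a rough illustration of what "self-exciting" means in the abstract above, the sketch below evaluates the conditional intensity of a plain univariate self-exciting (Hawkes-type) process. The exponential kernel, the function name `hawkes_intensity`, and the parameter values `mu`, `alpha`, and `beta` are assumptions made for this example; the abstract does not state the kernel, and the paper's model additionally handles locations and latent participants, which this sketch omits.

```python
import math


def hawkes_intensity(t, history, mu=0.2, alpha=0.5, beta=1.0):
    """Conditional intensity at time t given past event times.

    lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i))

    mu is the background rate, alpha the jump caused by each past event,
    and beta the decay rate (all hypothetical values for illustration).
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i)) for t_i in history if t_i < t
    )
    return mu + excitation


past_events = [1.0, 1.5, 4.0]
print(hawkes_intensity(0.5, past_events))   # before any event: background rate only
print(hawkes_intensity(1.6, past_events))   # shortly after two events: elevated
print(hawkes_intensity(10.0, past_events))  # long after: decays back toward mu
```

Each past event temporarily raises the rate of future events, which is why bursts of interactions tend to cluster in time under such a model.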
A Survey of Location Prediction on Twitter
Locations, e.g., countries, states, cities, and points of interest, are
central to news, emergency events, and people's daily lives. Automatic
identification of locations associated with or mentioned in documents has been
explored for decades. As one of the most popular online social network
platforms, Twitter has attracted a large number of users who send millions of
tweets on a daily basis. Due to the worldwide coverage of its users and
real-time freshness of tweets, location prediction on Twitter has gained
significant attention in recent years. Research efforts are spent on dealing
with new challenges and opportunities brought by the noisy, short, and
context-rich nature of tweets. In this survey, we aim at offering an overall
picture of location prediction on Twitter. Specifically, we concentrate on the
prediction of user home locations, tweet locations, and mentioned locations. We
first define the three tasks and review the evaluation metrics. By summarizing
Twitter network, tweet content, and tweet context as potential inputs, we then
structurally highlight how the problems depend on these inputs. Each dependency
is illustrated by a comprehensive review of the corresponding strategies
adopted in state-of-the-art approaches. In addition, we also briefly review two
related problems, i.e., semantic location prediction and point-of-interest
recommendation. Finally, we list future research directions.
Comment: Accepted to TKDE. 30 pages, 1 figure
Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks
We propose a method for embedding two-dimensional locations in a continuous
vector space using a neural network-based model incorporating mixtures of
Gaussian distributions, presenting two model variants for text-based
geolocation and lexical dialectology. Evaluated over Twitter data, the proposed
model outperforms conventional regression-based geolocation and provides a
better estimate of uncertainty. We also show the effectiveness of the
representation for predicting words from location in lexical dialectology, and
evaluate it using the DARE dataset.
Comment: Conference on Empirical Methods in Natural Language Processing (EMNLP
2017), September 2017, Copenhagen, Denmark
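To make the "mixtures of Gaussian distributions" idea above concrete, the following sketch computes the negative log-likelihood of a two-dimensional coordinate under a Gaussian mixture, the kind of loss a mixture density network typically minimizes. It is a sketch under stated assumptions: in a real mixture density network the weights, means, and standard deviations would be produced by the neural network from the input text, whereas here they are fixed, hypothetical numbers, and the axis-aligned (diagonal) covariance and the name `mixture_nll` are choices made for this example rather than details taken from the paper.

```python
import math


def mixture_nll(point, weights, means, sigmas):
    """Negative log-likelihood of a 2-D point under a Gaussian mixture.

    Each component k has weight weights[k], mean means[k] = (mx, my) and
    per-axis standard deviations sigmas[k] = (sx, sy) (diagonal covariance,
    an assumption of this sketch). Log-sum-exp keeps distant points from
    underflowing to a zero density.
    """
    x, y = point
    log_terms = []
    for w, (mx, my), (sx, sy) in zip(weights, means, sigmas):
        log_norm = -math.log(2.0 * math.pi * sx * sy)
        exponent = -0.5 * (((x - mx) / sx) ** 2 + ((y - my) / sy) ** 2)
        log_terms.append(math.log(w) + log_norm + exponent)
    peak = max(log_terms)
    return -(peak + math.log(sum(math.exp(l - peak) for l in log_terms)))


# Hypothetical two-component mixture over (latitude, longitude).
weights = [0.6, 0.4]
means = [(-37.8, 145.0), (51.5, -0.1)]
sigmas = [(1.0, 1.0), (1.0, 1.0)]

print(mixture_nll((-37.5, 144.8), weights, means, sigmas))  # near a component mean
print(mixture_nll((0.0, 0.0), weights, means, sigmas))      # far from both means
```

Because the output is a full density rather than a single point, the spread of the mixture gives a natural measure of how uncertain a predicted location is.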
Bayesian Fused Lasso regression for dynamic binary networks
We propose a multinomial logistic regression model for link prediction in a
time series of directed binary networks. To account for the dynamic nature of
the data we employ a dynamic model for the model parameters that is strongly
connected with the fused lasso penalty. In addition to promoting sparseness,
this prior allows us to explore the presence of change points in the structure
of the network. We introduce fast computational algorithms for estimation and
prediction using both optimization and Bayesian approaches. The performance of
the model is illustrated using simulated data and data from a financial trading
network in the NYMEX natural gas futures market. Supplementary material
containing the trading network data set and code to implement the algorithms is
available online.
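The link between the fused lasso penalty and change points mentioned above can be seen in a small sketch. The standard fused lasso penalty adds an absolute-value term on each coefficient (sparsity) and on each difference between consecutive coefficients (smoothness over time). The function name `fused_lasso_penalty` and the weights `lam_sparse` and `lam_fused` are assumptions for this illustration; the paper's Bayesian prior is described only as strongly connected with this penalty, so this is not its exact formulation.

```python
def fused_lasso_penalty(coefs, lam_sparse=1.0, lam_fused=1.0):
    """Fused lasso penalty for a coefficient path over time.

    penalty = lam_sparse * sum_t |b_t| + lam_fused * sum_t |b_t - b_{t-1}|

    The first term favors coefficients that are exactly zero; the second
    favors paths that stay constant and change only at a few time points.
    """
    sparsity = sum(abs(b) for b in coefs)
    jumps = sum(abs(b1 - b0) for b0, b1 in zip(coefs, coefs[1:]))
    return lam_sparse * sparsity + lam_fused * jumps


# Two hypothetical paths with the same largest coefficient value.
one_change_point = [0.0, 0.0, 2.0, 2.0, 2.0]
oscillating = [0.0, 2.0, 0.0, 2.0, 0.0]

print(fused_lasso_penalty(one_change_point))
print(fused_lasso_penalty(oscillating))
```

The path with a single jump is penalized less than the oscillating one, which is why estimates under such a penalty tend to be piecewise constant, with the jumps interpretable as change points.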
Uncovering latent structure in valued graphs: A variational approach
As more and more network-structured data sets become available, the statistical
analysis of valued graphs has become commonplace. Looking for a latent
structure is one of the many strategies used to better understand the behavior
of a network. Several methods already exist for the binary case. We present a
model-based strategy to uncover groups of nodes in valued graphs. This
framework can be used for a wide range of parametric random graph models and
allows covariates to be included. Variational tools allow us to achieve approximate
maximum likelihood estimation of the parameters of these models. We provide a
simulation study showing that our estimation method performs well over a broad
range of situations. We apply this method to analyze host-parasite interaction
networks in forest ecosystems.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS361 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)