Efficient Non-parametric Bayesian Hawkes Processes
In this paper, we develop an efficient nonparametric Bayesian estimation of
the kernel function of Hawkes processes. The non-parametric Bayesian approach
is important because it provides flexible Hawkes kernels and quantifies their
uncertainty. Our method is based on the cluster representation of Hawkes
processes. Utilizing the stationarity of the Hawkes process, we efficiently
sample random branching structures and thus, we split the Hawkes process into
clusters of Poisson processes. We derive two algorithms -- a block Gibbs
sampler and a maximum a posteriori estimator based on expectation maximization
-- and we show that our methods have a linear time complexity, both
theoretically and empirically. On synthetic data, we show our methods to be
able to infer flexible Hawkes triggering kernels. On two large-scale Twitter
diffusion datasets, we show that our methods outperform the current
state-of-the-art in goodness-of-fit and that the time complexity is linear in
the size of the dataset. We also observe that on diffusions related to online
videos, the learned kernels reflect the perceived longevity for different
content types such as music or pet videos.
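The branching structure mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' linear-time sampler: it is the textbook quadratic-time parent-sampling step, in which each event is attributed either to the background process (with weight proportional to the background rate) or to an earlier event (with weight proportional to the kernel evaluated at the time lag). The names `sample_branching` and `exp_kernel`, the exponential kernel, and all parameter values are assumptions chosen for illustration.

```python
import math
import random
from typing import Callable, List, Sequence


def exp_kernel(dt: float, a: float = 0.5, b: float = 1.0) -> float:
    """Hypothetical exponential triggering kernel a * b * exp(-b * dt)."""
    return a * b * math.exp(-b * dt)


def sample_branching(
    times: Sequence[float],
    mu: float,
    kernel: Callable[[float], float],
    rng: random.Random,
) -> List[int]:
    """Sample one parent per event; -1 marks a background (immigrant) event.

    `times` must be sorted in increasing order. This naive version costs
    O(n^2) kernel evaluations for n events.
    """
    parents = []
    for i, t in enumerate(times):
        # Candidate parents: background (-1) or any earlier event j < i.
        candidates = [-1] + list(range(i))
        weights = [mu] + [kernel(t - times[j]) for j in range(i)]
        parents.append(rng.choices(candidates, weights=weights)[0])
    return parents


times = [0.2, 0.5, 0.9, 2.4, 2.5]
parents = sample_branching(times, mu=0.3, kernel=exp_kernel, rng=random.Random(0))
```

Events that share the same parent form one offspring cluster, and the events with parent -1 form the background cluster, which is the sense in which a sampled branching structure splits the process into Poisson clusters.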
Modeling for seasonal marked point processes: An analysis of evolving hurricane occurrences
Seasonal point processes refer to stochastic models for random events which
are only observed in a given season. We develop nonparametric Bayesian
methodology to study the dynamic evolution of a seasonal marked point process
intensity. We assume the point process is a nonhomogeneous Poisson process and
propose a nonparametric mixture of beta densities to model dynamically evolving
temporal Poisson process intensities. Dependence structure is built through a
dependent Dirichlet process prior for the seasonally-varying mixing
distributions. We extend the nonparametric model to incorporate time-varying
marks, resulting in flexible inference for both the seasonal point process
intensity and for the conditional mark distribution. The motivating application
involves the analysis of hurricane landfalls with reported damages along the
U.S. Gulf and Atlantic coasts from 1900 to 2010. We focus on studying the
evolution of the intensity of the process of hurricane landfall occurrences,
and the respective maximum wind speed and associated damages. Our results
indicate an increase in the number of hurricane landfall occurrences and a
decrease in the median maximum wind speed at the peak of the season.
Introducing standardized damage as a mark, such that reported damages are
comparable both in time and space, we find that there is no significant rising
trend in hurricane damages over time.
Comment: Published at http://dx.doi.org/10.1214/14-AOAS796 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Fast and scalable non-parametric Bayesian inference for Poisson point processes
We study the problem of non-parametric Bayesian estimation of the intensity
function of a Poisson point process. The observations are independent
realisations of a Poisson point process on an interval. We propose two
related approaches. In both approaches we model the intensity function as
piecewise constant on bins forming a partition of this interval. In
the first approach the coefficients of the intensity function are assigned
independent gamma priors, leading to a closed form posterior distribution. On
the theoretical side, we prove that, as the number of observations grows, the posterior
asymptotically concentrates around the "true", data-generating intensity
function at an optimal rate for Hölder regular intensity functions. In the
second approach we employ a gamma Markov chain prior on the
coefficients of the intensity function. The posterior distribution is no longer
available in closed form, but inference can be performed using a
straightforward version of the Gibbs sampler. Both approaches scale well with
sample size, but the second is much less sensitive to the choice of the number of bins.
Practical performance of our methods is first demonstrated via synthetic data
examples. We compare our second method with other existing approaches on the UK
coal mining disasters data. Furthermore, we apply it to the US mass shootings
data and Donald Trump's Twitter data.
Comment: 45 pages, 22 figures
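The closed-form posterior of the first approach follows from gamma-Poisson conjugacy, which a minimal sketch can make concrete. The sketch assumes equal-width bins and a common Gamma(shape `alpha`, rate `beta`) prior for every bin. The function name `gamma_posterior` and the prior values are hypothetical, and the abstract does not state the authors' exact parametrisation. With n realisations and bin width w, a bin containing c events in total has posterior Gamma(alpha + c, beta + n * w).

```python
from typing import List, Sequence, Tuple


def gamma_posterior(
    realisations: Sequence[Sequence[float]],
    T: float,
    n_bins: int,
    alpha: float = 1.0,
    beta: float = 1.0,
) -> List[Tuple[float, float]]:
    """Return the (shape, rate) of the gamma posterior for each bin on [0, T]."""
    n = len(realisations)
    width = T / n_bins
    counts = [0] * n_bins
    for events in realisations:
        for t in events:
            if not 0.0 <= t <= T:
                raise ValueError(f"event time {t} lies outside [0, {T}]")
            # Events at t == T are assigned to the last bin.
            k = min(int(t / width), n_bins - 1)
            counts[k] += 1
    exposure = n * width  # total observation time accumulated per bin
    return [(alpha + c, beta + exposure) for c in counts]


# Two realisations on [0, 2] with two bins of width 1.
params = gamma_posterior([[0.1, 0.4, 1.2], [0.3, 1.7]], T=2.0, n_bins=2)
# params == [(4.0, 3.0), (3.0, 3.0)]
posterior_means = [shape / rate for shape, rate in params]
```

Because each bin is updated independently from its own count, the cost is a single pass over the events, which is consistent with the abstract's remark that the approach scales well with sample size.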