10,869 research outputs found

    Efficient Non-parametric Bayesian Hawkes Processes

    In this paper, we develop an efficient non-parametric Bayesian method for estimating the kernel function of Hawkes processes. The non-parametric Bayesian approach is important because it provides flexible Hawkes kernels and quantifies their uncertainty. Our method is based on the cluster representation of Hawkes processes. Utilizing the stationarity of the Hawkes process, we efficiently sample random branching structures and thus split the Hawkes process into clusters of Poisson processes. We derive two algorithms, a block Gibbs sampler and a maximum a posteriori estimator based on expectation maximization, and we show that our methods have linear time complexity, both theoretically and empirically. On synthetic data, we show that our methods are able to infer flexible Hawkes triggering kernels. On two large-scale Twitter diffusion datasets, we show that our methods outperform the current state of the art in goodness-of-fit and that the time complexity is linear in the size of the dataset. We also observe that on diffusions related to online videos, the learned kernels reflect the perceived longevity of different content types such as music or pet videos.
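The cluster representation mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it is the naive quadratic-time way of sampling a branching structure, without the stationarity-based speed-up the abstract describes. The names `sample_branching`, `mu` (background rate) and `kernel` (triggering kernel) are hypothetical, and the exponential kernel in the usage lines is an assumption made only for illustration.

```python
import math
import random


def sample_branching(times, mu, kernel, rng):
    """Sample a parent for every event of a Hawkes process.

    times  : sorted event times
    mu     : background (immigrant) rate, assumed > 0
    kernel : triggering kernel phi(dt), evaluated at dt >= 0
    rng    : random.Random instance
    Returns a list whose i-th entry is None (immigrant) or the index
    of the earlier event that triggered event i.
    """
    parents = []
    for i, t in enumerate(times):
        # Candidate -1 is the background process; candidates 0..i-1 are
        # earlier events, weighted by the kernel at the elapsed time.
        weights = [mu] + [kernel(t - times[j]) for j in range(i)]
        choice = rng.choices(range(-1, i), weights=weights)[0]
        parents.append(None if choice == -1 else choice)
    return parents


# Usage with an assumed exponential kernel a * b * exp(-b * dt).
a, b = 0.5, 1.0
events = [0.2, 0.9, 1.1, 3.0]
structure = sample_branching(
    events, mu=0.3, kernel=lambda dt: a * b * math.exp(-b * dt),
    rng=random.Random(0),
)
```

Events that share a parent form one cluster, and conditional on the sampled structure each cluster can be treated as a Poisson process, which is the decomposition the abstract relies on.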

    Modeling for seasonal marked point processes: An analysis of evolving hurricane occurrences

    Seasonal point processes refer to stochastic models for random events which are only observed in a given season. We develop nonparametric Bayesian methodology to study the dynamic evolution of a seasonal marked point process intensity. We assume the point process is a nonhomogeneous Poisson process and propose a nonparametric mixture of beta densities to model dynamically evolving temporal Poisson process intensities. Dependence structure is built through a dependent Dirichlet process prior for the seasonally-varying mixing distributions. We extend the nonparametric model to incorporate time-varying marks, resulting in flexible inference for both the seasonal point process intensity and for the conditional mark distribution. The motivating application involves the analysis of hurricane landfalls with reported damages along the U.S. Gulf and Atlantic coasts from 1900 to 2010. We focus on studying the evolution of the intensity of the process of hurricane landfall occurrences, and the respective maximum wind speed and associated damages. Our results indicate an increase in the number of hurricane landfall occurrences and a decrease in the median maximum wind speed at the peak of the season. Introducing standardized damage as a mark, such that reported damages are comparable both in time and space, we find that there is no significant rising trend in hurricane damages over time. Comment: Published at http://dx.doi.org/10.1214/14-AOAS796 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Fast and scalable non-parametric Bayesian inference for Poisson point processes

    We study the problem of non-parametric Bayesian estimation of the intensity function of a Poisson point process. The observations are n independent realisations of a Poisson point process on the interval [0,T]. We propose two related approaches. In both approaches we model the intensity function as piecewise constant on N bins forming a partition of the interval [0,T]. In the first approach the coefficients of the intensity function are assigned independent gamma priors, leading to a closed form posterior distribution. On the theoretical side, we prove that as n → ∞, the posterior asymptotically concentrates around the "true", data-generating intensity function at an optimal rate for h-Hölder regular intensity functions (0 < h ≤ 1). In the second approach we employ a gamma Markov chain prior on the coefficients of the intensity function. The posterior distribution is no longer available in closed form, but inference can be performed using a straightforward version of the Gibbs sampler. Both approaches scale well with sample size, but the second is much less sensitive to the choice of N. Practical performance of our methods is first demonstrated via synthetic data examples. We compare our second method with other existing approaches on the UK coal mining disasters data. Furthermore, we apply it to the US mass shootings data and Donald Trump's Twitter data. Comment: 45 pages, 22 figures
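The closed-form posterior of the first approach follows from gamma-Poisson conjugacy, which a short sketch can make concrete. This is an illustration under stated assumptions, not the authors' code: the bins are assumed to be equal-width, the prior is written as Gamma(alpha, beta) in shape-rate form, and the function names `gamma_posterior` and `posterior_means` are hypothetical. With n realisations and bin width T/N, a bin containing c events in total has posterior Gamma(alpha + c, beta + n T / N).

```python
def gamma_posterior(realisations, T, N, alpha=1.0, beta=1.0):
    """Posterior (shape, rate) per bin for a piecewise-constant intensity.

    realisations : list of event-time lists, one per observed realisation
    T            : right end of the observation interval [0, T]
    N            : number of equal-width bins (an assumption of this sketch)
    alpha, beta  : shape and rate of the independent gamma priors
    """
    width = T / N
    counts = [0] * N
    for events in realisations:
        for t in events:
            if not 0 <= t <= T:
                raise ValueError(f"event time {t} lies outside [0, {T}]")
            # An event exactly at T is assigned to the last bin.
            k = min(int(t / width), N - 1)
            counts[k] += 1
    # Each bin is observed for `width` time units in every realisation.
    exposure = len(realisations) * width
    return [(alpha + c, beta + exposure) for c in counts]


def posterior_means(params):
    """Posterior mean of each coefficient: shape divided by rate."""
    return [shape / rate for shape, rate in params]


# Two realisations on [0, 2] with two bins.
params = gamma_posterior([[0.1, 0.2, 1.5], [0.3, 1.9]], T=2.0, N=2)
means = posterior_means(params)
```

Because the update only needs per-bin counts, its cost grows linearly with the total number of events, which is consistent with the abstract's remark that the approach scales well with sample size.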