The two-sample problem for Poisson processes: adaptive tests with a non-asymptotic wild bootstrap approach
Considering two independent Poisson processes, we address the question of
testing equality of their respective intensities. We first propose single tests
whose test statistics are U-statistics based on general kernel functions. The
corresponding critical values are constructed from a non-asymptotic wild
bootstrap approach, leading to level \alpha tests. Various choices for the
kernel functions are possible, including projection, approximation or
reproducing kernels. In this last case, we obtain a parametric rate of testing
for a weak metric defined in the RKHS associated with the considered
reproducing kernel. In the other cases, we then introduce an aggregation
procedure that lets us import ideas from model selection, thresholding, and
approximation-kernel adaptive estimation. The resulting
multiple tests are proved to be of level \alpha, and to satisfy non-asymptotic
oracle type conditions for the classical L2-norm. From these conditions, we
deduce that they are adaptive in the minimax sense over a large variety of
classes of alternatives based on classical and weak Besov bodies in the
univariate case, but also Sobolev and anisotropic Nikol'skii-Besov balls in the
multivariate case.
Efficient Non-parametric Bayesian Hawkes Processes
In this paper, we develop an efficient non-parametric Bayesian method for
estimating the kernel function of Hawkes processes. The non-parametric Bayesian approach
is important because it provides flexible Hawkes kernels and quantifies their
uncertainty. Our method is based on the cluster representation of Hawkes
processes. Utilizing the stationarity of the Hawkes process, we efficiently
sample random branching structures and thus split the Hawkes process into
clusters of Poisson processes. We derive two algorithms -- a block Gibbs
sampler and a maximum a posteriori estimator based on expectation maximization
-- and we show that our methods have a linear time complexity, both
theoretically and empirically. On synthetic data, we show our methods to be
able to infer flexible Hawkes triggering kernels. On two large-scale Twitter
diffusion datasets, we show that our methods outperform the current
state-of-the-art in goodness-of-fit and that the time complexity is linear in
the size of the dataset. We also observe that on diffusions related to online
videos, the learned kernels reflect the perceived longevity for different
content types, such as music or pet videos.
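The cluster representation the paper builds on can be sketched with a naive branching-structure sampler: each event is attributed either to the background rate or to an earlier event, with probabilities proportional to the background intensity mu and to the triggering kernel phi evaluated at the time lag. This quadratic-time sketch omits the stationarity-based truncation behind the paper's linear complexity; mu and the exponential phi in the usage example are placeholder choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_branching(times, mu, phi):
    """Draw one random branching structure for an ordered event sequence.

    Each event i is assigned a parent: -1 for the background (immigrant
    events), or an index j < i, with unnormalized weights
    [mu, phi(t_i - t_0), ..., phi(t_i - t_{i-1})].
    """
    parents = []
    for i, t in enumerate(times):
        weights = np.array([mu] + [phi(t - times[j]) for j in range(i)])
        probs = weights / weights.sum()
        k = rng.choice(len(weights), p=probs)
        parents.append(int(k) - 1)  # -1 means background
    return parents
```

Conditioning on the sampled parents splits the Hawkes process into independent Poisson clusters, which is what the block Gibbs sampler and the EM-based MAP estimator exploit.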
Nonparametric likelihood based estimation of linear filters for point processes
We consider models for multivariate point processes where the intensity is
given nonparametrically in terms of functions in a reproducing kernel Hilbert
space. The likelihood function involves a time integral and is consequently not
given in terms of a finite number of kernel evaluations. The main result is a
representation of the gradient of the log-likelihood, which we use to derive
computable approximations of the log-likelihood and the gradient by time
discretization. These approximations are then used to minimize the approximate
penalized log-likelihood. For time and memory efficiency the implementation
relies crucially on the use of sparse matrices. As an illustration we consider
neuron network modeling, and we use this example to investigate how the
computational costs of the approximations depend on the resolution of the time
discretization. The implementation is available in the R package ppstat.
Comment: 10 pages, 3 figures
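The discretization step can be sketched in its simplest form: the point-process log-likelihood sum_i log lambda(t_i) - integral_0^T lambda(t) dt is approximated by evaluating the integral term on a regular midpoint grid. This is a generic one-dimensional illustration, not the paper's RKHS filter model or its sparse-matrix R implementation.

```python
import numpy as np

def discretized_loglik(event_times, intensity, T, n_grid=1000):
    """Approximate the point-process log-likelihood
    sum_i log lambda(t_i) - int_0^T lambda(t) dt
    by a midpoint Riemann sum on a regular grid over [0, T]."""
    grid = np.linspace(0.0, T, n_grid, endpoint=False) + T / (2 * n_grid)
    integral = intensity(grid).sum() * T / n_grid
    return np.log(intensity(np.asarray(event_times))).sum() - integral
```

For a constant intensity the midpoint sum is exact, which gives a convenient sanity check; in general the grid resolution trades accuracy against the computational cost the paper investigates.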
Scalable high-resolution forecasting of sparse spatiotemporal events with kernel methods: a winning solution to the NIJ "Real-Time Crime Forecasting Challenge"
We propose a generic spatiotemporal event forecasting method, which we developed for the National Institute of Justice's (NIJ) Real-Time Crime Forecasting Challenge (National Institute of Justice, 2017). Our method is a spatiotemporal forecasting model combining scalable randomized Reproducing Kernel Hilbert Space (RKHS) methods for approximating Gaussian processes with autoregressive smoothing kernels in a regularized supervised learning framework. While the smoothing kernels capture the two main approaches in current use in the field of crime forecasting, kernel density estimation (KDE) and self-exciting point process (SEPP) models, the RKHS component of the model can be understood as an approximation to the popular log-Gaussian Cox process model. For inference, we discretize the spatiotemporal point pattern and learn a log-intensity function using the Poisson likelihood and highly efficient gradient-based optimization methods. Model hyperparameters, including the quality of the RKHS approximation, spatial and temporal kernel lengthscales, the number of autoregressive lags, bandwidths for the smoothing kernels, and cell shape, size, and rotation, were learned using cross-validation. The resulting predictions significantly exceeded baseline KDE estimates and SEPP models for sparse events.
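The core recipe, a random-feature approximation to an RKHS function fitted by maximizing a discretized Poisson likelihood, can be sketched as follows. Random Fourier features (Rahimi & Recht) stand in for the scalable RKHS approximation of a Gaussian process, and plain gradient ascent stands in for the optimizer; the feature count, lengthscale, ridge penalty, and learning rate are placeholder values, and the autoregressive smoothing kernels and learned grid geometry of the full solution are omitted.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_fourier_features(X, n_feat=50, lengthscale=1.0):
    # Random Fourier features approximating a Gaussian-kernel RKHS
    d = X.shape[1]
    W = rng.normal(0.0, 1.0 / lengthscale, size=(d, n_feat))
    b = rng.uniform(0.0, 2.0 * np.pi, n_feat)
    return np.sqrt(2.0 / n_feat) * np.cos(X @ W + b)

def fit_poisson_intensity(Phi, counts, l2=1e-2, lr=0.1, n_iter=500):
    """Gradient ascent on the penalized Poisson log-likelihood
    sum_c [ y_c * f_c - exp(f_c) ] - l2/2 * ||w||^2, with f = Phi @ w,
    where y_c are event counts in the discretized spatiotemporal cells."""
    w = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        f = Phi @ w
        grad = Phi.T @ (counts - np.exp(f)) - l2 * w
        w += lr * grad / len(counts)
    return w
```

The learned f = Phi @ w plays the role of the log-intensity surface; forecasts are then exp(f) evaluated on the grid cells.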
Generalizations of Ripley's K-function with Application to Space Curves
The intensity function and Ripley's K-function have been used extensively in
the literature to describe the first and second moment structure of spatial
point sets. This has many applications including describing the statistical
structure of synaptic vesicles. Some attempts have been made to extend Ripley's
K-function to curve pieces. Such an extension can be used to describe the
statistical structure of muscle fibers and brain fiber tracts. In this paper,
we take a computational perspective and construct new and very general variants
of Ripley's K-function for curve pieces, surface patches, etc. We discuss the
method from [Chiu, Stoyan, Kendall, & Mecke 2013] and compare it with our
generalizations theoretically, and we give examples demonstrating the
difference in their ability to separate sets of curve pieces.
Comment: 9 pages, 8 figures
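For context, the classical Ripley's K-function that the paper generalizes admits a simple estimator for a 2-D point pattern: K(r) is the expected number of further points within distance r of a typical point, divided by the intensity. A minimal sketch without edge correction, with the observation-window area passed in explicitly:

```python
import numpy as np

def ripley_k(points, r, area):
    """Naive estimator of Ripley's K for a 2-D point pattern:
    K(r) ~ area / (n * (n - 1)) * #{ordered pairs (i, j), i != j,
                                     with |x_i - x_j| <= r}.
    No edge correction is applied."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # exclude self-pairs
    return area * (d <= r).sum() / (n * (n - 1))
```

For complete spatial randomness K(r) is close to pi * r^2; the paper's generalizations replace the points with curve pieces or surface patches, where pairwise "distance" must itself be redefined.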