Modeling and estimation of multi-source clustering in crime and security data
While the presence of clustering in crime and security event data is well
established, the mechanism(s) by which clustering arises is not fully
understood. Both contagion models and history independent correlation models
are applied, but not simultaneously. In an attempt to disentangle contagion
from other types of correlation, we consider a Hawkes process with background
rate driven by a log Gaussian Cox process. Our inference methodology is an
efficient Metropolis adjusted Langevin algorithm for filtering of the intensity
and estimation of the model parameters. We apply the methodology to property
and violent crime data from Chicago, terrorist attack data from Northern
Ireland and Israel, and civilian casualty data from Iraq. For each data set we
quantify the uncertainty in the levels of contagion vs. history independent
correlation.
Comment: Published in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics
(http://www.imstat.org) at http://dx.doi.org/10.1214/13-AOAS647.
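To make the model structure in the abstract above concrete, here is a minimal sketch of a Hawkes conditional intensity: a time-varying background rate plus a self-exciting term contributed by each past event. The exponential triggering kernel, the parameter names `alpha` and `omega`, the event times, and the sinusoidal stand-in for a log Gaussian Cox background are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' inference code.

```python
import math

def hawkes_intensity(t, events, background, alpha, omega):
    """Background rate at t plus exponentially decaying excitation
    from every event strictly before t (assumed kernel form)."""
    excitation = sum(
        alpha * omega * math.exp(-omega * (t - s)) for s in events if s < t
    )
    return background(t) + excitation

# Hypothetical event times and parameters, chosen only for illustration.
events = [1.0, 2.5, 2.7]

# Stand-in for exp(Gaussian process path): positive and slowly varying.
def background(t):
    return math.exp(0.1 * math.sin(t))

before_any_event = hawkes_intensity(0.5, events, background, alpha=0.5, omega=2.0)
after_cluster = hawkes_intensity(3.0, events, background, alpha=0.5, omega=2.0)
```

Before the first event the intensity equals the background rate; after a burst of events it sits above the background, which is the contagion effect the model separates from history-independent correlation.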
The EM Algorithm and the Rise of Computational Biology
In the past decade computational biology has grown from a cottage industry
with a handful of researchers to an attractive interdisciplinary field,
catching the attention and imagination of many quantitatively-minded
scientists. Of interest to us is the key role played by the EM algorithm during
this transformation. We survey the use of the EM algorithm in a few important
computational biology problems surrounding the "central dogma" of molecular
biology: from DNA to RNA and then to proteins. Topics of this article include
sequence motif discovery, protein sequence alignment, population genetics,
evolutionary models and mRNA expression microarray data analysis.
Comment: Published in Statistical Science (http://www.imstat.org/sts/) by the
Institute of Mathematical Statistics (http://www.imstat.org) at
http://dx.doi.org/10.1214/09-STS312.
The EM Algorithm in Genetics, Genomics and Public Health
The popularity of the EM algorithm owes much to the 1977 paper by Dempster,
Laird and Rubin. That paper gave the algorithm its name, identified the general
form and some key properties of the algorithm and established its broad
applicability in scientific research. This review gives a nontechnical
introduction to the algorithm for a general scientific audience, and presents a
few examples characteristic of its application.
Comment: Published in Statistical Science (http://www.imstat.org/sts/) by the
Institute of Mathematical Statistics (http://www.imstat.org) at
http://dx.doi.org/10.1214/08-STS270.
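As a small illustration of the algorithm this review introduces, the sketch below runs EM on a two-component one-dimensional Gaussian mixture. The unit variances, the function names, the starting means, and the toy data are assumptions chosen to keep the example short; it is not drawn from the paper's own examples.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_two_gaussians(data, mean_a, mean_b, n_iter=50):
    """EM for a mixture of two unit-variance Gaussians (simplifying assumption).
    Returns the weight of component A and both component means."""
    weight_a = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component A.
        resp = []
        for x in data:
            pa = weight_a * normal_pdf(x, mean_a, 1.0)
            pb = (1.0 - weight_a) * normal_pdf(x, mean_b, 1.0)
            resp.append(pa / (pa + pb))
        # M-step: re-estimate the weight and means from the soft assignments.
        total_a = sum(resp)
        total_b = len(data) - total_a
        mean_a = sum(r * x for r, x in zip(resp, data)) / total_a
        mean_b = sum((1.0 - r) * x for r, x in zip(resp, data)) / total_b
        weight_a = total_a / len(data)
    return weight_a, mean_a, mean_b

# Toy data: five points near -2 and five points near 3.
data = [-2.1, -1.9, -2.0, -1.8, -2.2, 3.0, 3.1, 2.9, 3.2, 2.8]
weight_a, mean_a, mean_b = em_two_gaussians(data, mean_a=-1.0, mean_b=1.0)
```

With clusters this well separated, the estimated means settle close to the two cluster centres and the mixing weight close to one half.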
Nonparametric ridge estimation
We study the problem of estimating the ridges of a density function. Ridge
estimation is an extension of mode finding and is useful for understanding the
structure of a density. It can also be used to find hidden structure in point
cloud data. We show that, under mild regularity conditions, the ridges of the
kernel density estimator consistently estimate the ridges of the true density.
When the data are noisy measurements of a manifold, we show that the ridges are
close and topologically similar to the hidden manifold. To find the estimated
ridges in practice, we adapt the modified mean-shift algorithm proposed by
Ozertem and Erdogmus [J. Mach. Learn. Res. 12 (2011) 1249-1286]. Some numerical
experiments verify that the algorithm is accurate.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) by
the Institute of Mathematical Statistics (http://www.imstat.org) at
http://dx.doi.org/10.1214/14-AOS1218.
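Since the abstract describes ridge estimation as an extension of mode finding, the simplest related computation is plain mean shift, which climbs to a mode of a Gaussian kernel density estimate. The sketch below is that unconstrained one-dimensional mean shift only, not the subspace-constrained variant of Ozertem and Erdogmus that the paper adapts; the function name, bandwidth, and data are assumptions for illustration.

```python
import math

def mean_shift_mode(start, data, bandwidth, n_iter=100, tol=1e-9):
    """Iterate the mean-shift update from `start` until it stops moving.
    Each step moves x to the kernel-weighted average of the data."""
    x = start
    for _ in range(n_iter):
        weights = [math.exp(-(((x - p) / bandwidth) ** 2) / 2) for p in data]
        new_x = sum(w * p for w, p in zip(weights, data)) / sum(weights)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x

# Toy point cloud with one cluster near -2 and another near 3.
data = [-2.1, -1.9, -2.0, -1.8, -2.2, 3.0, 3.1, 2.9, 3.2, 2.8]
left_mode = mean_shift_mode(-1.0, data, bandwidth=0.5)
right_mode = mean_shift_mode(2.0, data, bandwidth=0.5)
```

Different starting points converge to different modes, so running the update from every data point recovers all modes of the estimated density. The ridge version restricts each step to a subspace determined by the Hessian of the density estimate.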
Baryonic Effects on Lagrangian Clustering and Angular Momentum Reconstruction
Recent studies illustrate the correlation between the angular momenta of cosmic structures and their Lagrangian properties. However, only baryons are observable and it is unclear whether they reliably trace the cosmic angular momenta. We study the Lagrangian mass distribution, spin correlation, and predictability of dark matter, gas, and stellar components of galaxy-halo systems using IllustrisTNG, and show that the primordial segregations between components are typically small. Their protoshapes are also similar in terms of the statistics of moment of inertia tensors. Under the common gravitational potential they are expected to exert the same tidal torque and the strong spin correlations are not destroyed by the nonlinear evolution and complicated baryonic effects, as confirmed by the high-resolution hydrodynamic simulations. We further show that their late-time angular momenta traced by total gas, stars, or the central galaxies, can be reliably reconstructed from the initial perturbations. These results suggest that baryonic angular momenta can potentially be used in reconstructing the parameters and models related to the initial perturbations.
Peer reviewed
A stochastic algorithm for probabilistic independent component analysis
The decomposition of a sample of images on a relevant subspace is a recurrent
problem in many different fields from Computer Vision to medical image
analysis. We propose in this paper a new learning principle and implementation
of the generative decomposition model generally known as noisy ICA (for
independent component analysis) based on the SAEM algorithm, which is a
versatile stochastic approximation of the standard EM algorithm. We demonstrate
the applicability of the method on a large range of decomposition models and
illustrate the developments with experimental results on various data sets.
Comment: Published in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics
(http://www.imstat.org) at http://dx.doi.org/10.1214/11-AOAS499.
Asymptotic goodness-of-fit tests for the Palm mark distribution of stationary point processes with correlated marks
We consider spatially homogeneous marked point patterns in an unboundedly
expanding convex sampling window. Our main objective is to identify the
distribution of the typical mark by constructing an asymptotic
$\chi^2$-goodness-of-fit test. The corresponding test statistic is based on a
natural empirical version of the Palm mark distribution and a smoothed
covariance estimator which turns out to be mean square consistent. Our approach
does not require independent marks and allows dependences between the mark
field and the point pattern. Instead we impose a suitable $\beta$-mixing
condition on the underlying stationary marked point process which can be
checked for a number of Poisson-based models and, in particular, in the case of
geostatistical marking. In order to study test performance, our test approach
is applied to detect anisotropy of specific Boolean models.
Comment: Published in Bernoulli (http://isi.cbs.nl/bernoulli/) by the
International Statistical Institute/Bernoulli Society
(http://isi.cbs.nl/BS/bshome.htm) at http://dx.doi.org/10.3150/13-BEJ523.
arXiv admin note: substantial text overlap with arXiv:1205.504