
    Modeling and estimation of multi-source clustering in crime and security data

    While the presence of clustering in crime and security event data is well established, the mechanism(s) by which clustering arises are not fully understood. Both contagion models and history-independent correlation models are applied, but not simultaneously. In an attempt to disentangle contagion from other types of correlation, we consider a Hawkes process with background rate driven by a log-Gaussian Cox process. Our inference methodology is an efficient Metropolis-adjusted Langevin algorithm for filtering of the intensity and estimation of the model parameters. We apply the methodology to property and violent crime data from Chicago, terrorist attack data from Northern Ireland and Israel, and civilian casualty data from Iraq. For each data set we quantify the uncertainty in the levels of contagion vs. history-independent correlation. Comment: Published at http://dx.doi.org/10.1214/13-AOAS647 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
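To illustrate the model structure named in the abstract above, the following is a minimal sketch of a Hawkes conditional intensity: a background rate plus self-excitation from past events. The exponential triggering kernel, the parameter names `alpha` and `beta`, and the constant background function are assumptions made for illustration only; the abstract does not specify the kernel, and in the paper the background is driven by a log-Gaussian Cox process rather than a constant.

```python
import math


def hawkes_intensity(t, events, background, alpha, beta):
    """Conditional intensity at time t given past event times.

    background: callable returning the background rate at time t
                (a stand-in for the log-Gaussian Cox component).
    alpha:      expected number of offspring per event (assumed name).
    beta:       decay rate of the assumed exponential kernel.
    """
    # Each past event s < t adds alpha * beta * exp(-beta * (t - s)).
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - s)) for s in events if s < t
    )
    return background(t) + excitation


def constant_background(t):
    # Hypothetical constant background rate, used only for the example.
    return 0.5


past_events = [1.0, 2.0]
rate_before = hawkes_intensity(0.5, past_events, constant_background, 0.8, 1.0)
rate_after = hawkes_intensity(3.0, past_events, constant_background, 0.8, 1.0)
```

Before any event has occurred the intensity equals the background rate; after the events at times 1 and 2 it is elevated by the decaying contributions of both.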

    The EM Algorithm and the Rise of Computational Biology

    In the past decade computational biology has grown from a cottage industry with a handful of researchers to an attractive interdisciplinary field, catching the attention and imagination of many quantitatively minded scientists. Of interest to us is the key role played by the EM algorithm during this transformation. We survey the use of the EM algorithm in a few important computational biology problems surrounding the "central dogma" of molecular biology: from DNA to RNA and then to proteins. Topics of this article include sequence motif discovery, protein sequence alignment, population genetics, evolutionary models and mRNA expression microarray data analysis. Comment: Published at http://dx.doi.org/10.1214/09-STS312 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    The EM Algorithm in Genetics, Genomics and Public Health

    The popularity of the EM algorithm owes much to the 1977 paper by Dempster, Laird and Rubin. That paper gave the algorithm its name, identified the general form and some key properties of the algorithm and established its broad applicability in scientific research. This review gives a nontechnical introduction to the algorithm for a general scientific audience, and presents a few examples characteristic of its application. Comment: Published at http://dx.doi.org/10.1214/08-STS270 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org
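As a concrete instance of the general form the abstract above refers to, here is a minimal sketch of EM for a two-component one-dimensional Gaussian mixture. It is a textbook example, not one taken from the review; the function names, the initialisation at the data extremes, and the simulated data are all assumptions chosen for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(data, n_iter=100):
    """Fit a two-component 1D Gaussian mixture by EM.

    Returns (weight of component 0, [mean0, mean1], [var0, var1]).
    """
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    # Crude initialisation (an assumption): means at the data extremes.
    means = [min(data), max(data)]
    variances = [overall_var, overall_var]
    weight = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component 0.
        resp = []
        for x in data:
            p0 = weight * normal_pdf(x, means[0], variances[0])
            p1 = (1 - weight) * normal_pdf(x, means[1], variances[1])
            resp.append(p0 / (p0 + p1))
        # M-step: responsibility-weighted updates of the parameters.
        n0 = sum(resp)
        n1 = n - n0
        m0 = sum(r * x for r, x in zip(resp, data)) / n0
        m1 = sum((1 - r) * x for r, x in zip(resp, data)) / n1
        v0 = sum(r * (x - m0) ** 2 for r, x in zip(resp, data)) / n0
        v1 = sum((1 - r) * (x - m1) ** 2 for r, x in zip(resp, data)) / n1
        means = [m0, m1]
        # Floor the variances to avoid degenerate components.
        variances = [max(v0, 1e-6), max(v1, 1e-6)]
        weight = n0 / n
    return weight, means, variances


# Simulated data: two well-separated clusters (illustrative values only).
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(300)]
sample += [rng.gauss(6.0, 1.0) for _ in range(300)]
fit_weight, fit_means, fit_variances = em_two_gaussians(sample)
```

The E-step treats the unknown component labels as missing data and replaces them with their conditional expectations; the M-step then maximises the resulting expected complete-data log-likelihood in closed form.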

    Nonparametric ridge estimation

    We study the problem of estimating the ridges of a density function. Ridge estimation is an extension of mode finding and is useful for understanding the structure of a density. It can also be used to find hidden structure in point cloud data. We show that, under mild regularity conditions, the ridges of the kernel density estimator consistently estimate the ridges of the true density. When the data are noisy measurements of a manifold, we show that the ridges are close and topologically similar to the hidden manifold. To find the estimated ridges in practice, we adapt the modified mean-shift algorithm proposed by Ozertem and Erdogmus [J. Mach. Learn. Res. 12 (2011) 1249-1286]. Some numerical experiments verify that the algorithm is accurate. Comment: Published at http://dx.doi.org/10.1214/14-AOS1218 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
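Since the abstract above describes ridge estimation as an extension of mode finding, the sketch below shows the plain mean-shift iteration for locating a mode of a Gaussian kernel density estimate in one dimension. This is the ordinary mean-shift algorithm, not the modified (subspace-constrained) variant of Ozertem and Erdogmus that the paper adapts; the function name, bandwidth, and data are assumptions for illustration.

```python
import math


def mean_shift_mode(start, data, bandwidth, n_iter=200, tol=1e-9):
    """Climb from `start` to a nearby mode of the Gaussian KDE of `data`."""
    x = start
    for _ in range(n_iter):
        # Gaussian kernel weight of every data point relative to x.
        weights = [math.exp(-(((x - p) / bandwidth) ** 2) / 2) for p in data]
        # Mean-shift update: move x to the kernel-weighted mean of the data.
        new_x = sum(w * p for w, p in zip(weights, data)) / sum(weights)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Two small clusters, centred at 1.0 and 5.0 (illustrative values only).
points = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
left_mode = mean_shift_mode(0.5, points, bandwidth=0.5)
right_mode = mean_shift_mode(5.5, points, bandwidth=0.5)
```

Each starting point converges to the mode of the cluster it begins nearest to. In more than one dimension, the ridge-finding variant restricts each such step to a subspace determined by the Hessian of the density estimate.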

    Baryonic Effects on Lagrangian Clustering and Angular Momentum Reconstruction

    Recent studies illustrate the correlation between the angular momenta of cosmic structures and their Lagrangian properties. However, only baryons are observable and it is unclear whether they reliably trace the cosmic angular momenta. We study the Lagrangian mass distribution, spin correlation, and predictability of dark matter, gas, and stellar components of galaxy-halo systems using IllustrisTNG, and show that the primordial segregations between components are typically small. Their protoshapes are also similar in terms of the statistics of moment of inertia tensors. Under the common gravitational potential they are expected to exert the same tidal torque and the strong spin correlations are not destroyed by the nonlinear evolution and complicated baryonic effects, as confirmed by the high-resolution hydrodynamic simulations. We further show that their late-time angular momenta traced by total gas, stars, or the central galaxies, can be reliably reconstructed by the initial perturbations. These results suggest that baryonic angular momenta can potentially be used in reconstructing the parameters and models related to the initial perturbations.Peer reviewe

    A stochastic algorithm for probabilistic independent component analysis

    The decomposition of a sample of images on a relevant subspace is a recurrent problem in many different fields, from computer vision to medical image analysis. We propose in this paper a new learning principle and implementation of the generative decomposition model generally known as noisy ICA (for independent component analysis) based on the SAEM algorithm, which is a versatile stochastic approximation of the standard EM algorithm. We demonstrate the applicability of the method on a large range of decomposition models and illustrate the developments with experimental results on various data sets. Comment: Published at http://dx.doi.org/10.1214/11-AOAS499 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Asymptotic goodness-of-fit tests for the Palm mark distribution of stationary point processes with correlated marks

    We consider spatially homogeneous marked point patterns in an unboundedly expanding convex sampling window. Our main objective is to identify the distribution of the typical mark by constructing an asymptotic $\chi^2$-goodness-of-fit test. The corresponding test statistic is based on a natural empirical version of the Palm mark distribution and a smoothed covariance estimator which turns out to be mean square consistent. Our approach does not require independent marks and allows dependences between the mark field and the point pattern. Instead we impose a suitable $\beta$-mixing condition on the underlying stationary marked point process which can be checked for a number of Poisson-based models and, in particular, in the case of geostatistical marking. In order to study test performance, our test approach is applied to detect anisotropy of specific Boolean models. Comment: Published at http://dx.doi.org/10.3150/13-BEJ523 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm). arXiv admin note: substantial text overlap with arXiv:1205.504