65 research outputs found

    Adaptive lasso and Dantzig selector for spatial point processes intensity estimation

    The lasso and the Dantzig selector are standard procedures that perform variable selection and estimation simultaneously. This paper extends these procedures to intensity estimation for spatial point processes. We propose adaptive versions of these procedures, develop efficient computational methodologies, and derive asymptotic results for a large class of spatial point processes under an original setting where the number of parameters, i.e. the number of spatial covariates considered, increases with the expected number of data points. The two procedures are compared theoretically, in a simulation study, and in a real data example. Comment: 30 page
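
To make the idea of an adaptive lasso for intensity estimation concrete, here is a minimal sketch, not the paper's method. It assumes a log-linear Poisson intensity, a pixel-count approximation of the Poisson likelihood on a regular grid, and a plain proximal-gradient solver with backtracking. The function names, the simulated covariates, and the tuning value `lam=20.0` are all hypothetical choices made for illustration.

```python
import numpy as np

def neg_loglik(beta0, beta, Z, counts, areas):
    # Gridded Poisson negative log-likelihood (up to a constant):
    # sum_j [ a_j * exp(eta_j) - n_j * eta_j ], with eta_j = beta0 + z_j' beta
    eta = beta0 + Z @ beta
    return np.sum(areas * np.exp(eta) - counts * eta)

def soft_threshold(x, t):
    # Proximal operator of the (weighted) l1 norm
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def adaptive_lasso_poisson(Z, counts, areas, lam, weights, n_iter=1000):
    """Proximal gradient for the l1-penalized gridded Poisson likelihood.

    The penalty is lam * sum_k weights[k] * |beta[k]|; the intercept is unpenalized.
    """
    beta0 = np.log(counts.sum() / areas.sum())  # homogeneous-intensity start
    beta = np.zeros(Z.shape[1])
    step = 1.0
    with np.errstate(over="ignore"):  # oversized trial steps may overflow exp
        for _ in range(n_iter):
            resid = areas * np.exp(beta0 + Z @ beta) - counts
            g0, g = resid.sum(), Z.T @ resid  # gradient of the smooth part
            f_old = neg_loglik(beta0, beta, Z, counts, areas)
            while True:  # backtracking on the step size
                new_beta0 = beta0 - step * g0
                new_beta = soft_threshold(beta - step * g, step * lam * weights)
                d0, d = new_beta0 - beta0, new_beta - beta
                f_new = neg_loglik(new_beta0, new_beta, Z, counts, areas)
                bound = f_old + g0 * d0 + g @ d + (d0**2 + d @ d) / (2 * step)
                if f_new <= bound + 1e-12:
                    break
                step *= 0.5
            beta0, beta = new_beta0, new_beta
    return beta0, beta

# Simulated example: 400 grid cells on the unit square, 5 covariates, 2 active.
rng = np.random.default_rng(0)
n_cells, p = 400, 5
areas = np.full(n_cells, 1.0 / n_cells)
Z = rng.normal(size=(n_cells, p))
beta_true = np.array([1.0, 0.0, 0.0, -0.8, 0.0])
counts = rng.poisson(areas * np.exp(np.log(2000.0) + Z @ beta_true))

# Pilot (unpenalized) fit supplies the adaptive weights 1 / |beta_pilot|.
_, beta_pilot = adaptive_lasso_poisson(Z, counts, areas, lam=0.0, weights=np.ones(p))
weights = 1.0 / np.abs(beta_pilot)
beta0_hat, beta_hat = adaptive_lasso_poisson(Z, counts, areas, lam=20.0, weights=weights)
print(np.round(beta_hat, 3))
```

The adaptive weights penalize covariates with small pilot estimates more heavily, which is what allows the fit to set irrelevant coefficients exactly to zero while shrinking the active ones only slightly.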

    Nonconcave penalized composite conditional likelihood estimation of sparse Ising models

    The Ising model is a useful tool for studying complex interactions within a system. Estimating such a model, however, is challenging, especially when the parameters are high-dimensional. In this work, we propose efficient procedures for learning a sparse Ising model based on a penalized composite conditional likelihood with nonconcave penalties. Nonconcave penalized likelihood estimation has received a lot of attention in recent years, but it is computationally prohibitive for high-dimensional Ising models. To overcome this difficulty, we extend the methodology and theory of nonconcave penalized likelihood to penalized composite conditional likelihood estimation. The proposed method can be implemented efficiently by taking advantage of coordinate-ascent and minorization-maximization principles. Asymptotic oracle properties of the proposed method are established with NP-dimensionality. Optimality of the computed local solution is discussed. We demonstrate its finite sample performance via simulation studies and further illustrate our proposal by studying the Human Immunodeficiency Virus type 1 protease structure based on data from the Stanford HIV drug resistance database. Our statistical learning results match the known biological findings very well, although no prior biological information is used in the data analysis procedure. Comment: Published at http://dx.doi.org/10.1214/12-AOS1017 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
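
As a rough illustration of the objective described above, the following sketch evaluates a penalized composite conditional log-likelihood for an Ising model. It rests on several assumptions not stated in the abstract: spins coded as -1/+1, no main-effect terms, the SCAD penalty as the example of a nonconcave penalty, and a 1/N scaling of the likelihood. The function names are hypothetical, and the coordinate-ascent and minorization-maximization solver is not reproduced here.

```python
import numpy as np

def scad_penalty(t, lam, a=3.7):
    # SCAD penalty: linear near zero, quadratic blend, then constant.
    t = np.abs(np.asarray(t, dtype=float))
    linear = lam * t
    blend = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))
    const = lam**2 * (a + 1) / 2
    return np.where(t <= lam, linear, np.where(t <= a * lam, blend, const))

def composite_conditional_loglik(theta, X):
    """Sum over samples n and nodes j of log P(x_nj | x_n,-j).

    theta: symmetric (K, K) interaction matrix with zero diagonal.
    X:     (N, K) array of spins in {-1, +1}.
    Under the assumed model, P(x_j | x_-j) = 1 / (1 + exp(-2 x_j sum_k theta_jk x_k)).
    """
    field = X @ theta  # field[n, j] = sum_{k != j} theta[j, k] * X[n, k]
    return -np.sum(np.logaddexp(0.0, -2.0 * X * field))

def penalized_objective(theta, X, lam):
    # Objective to maximize: average composite log-likelihood minus the SCAD
    # penalty applied once to each pairwise interaction (upper triangle).
    upper = np.triu_indices(theta.shape[0], k=1)
    loglik = composite_conditional_loglik(theta, X) / X.shape[0]
    return loglik - np.sum(scad_penalty(theta[upper], lam))

# Tiny example with three spins and one non-zero interaction.
X = np.array([[1, 1, -1], [1, 1, 1], [-1, -1, 1], [-1, 1, 1]])
theta = np.zeros((3, 3))
theta[0, 1] = theta[1, 0] = 0.4
print(penalized_objective(theta, X, lam=0.1))
```

Each conditional term is a logistic likelihood for one spin given the others, which is why the composite likelihood avoids the intractable normalizing constant of the full Ising likelihood.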

    A nonparametric penalized likelihood approach to density estimation of space–time point patterns

    In this work, we consider space-time point processes and study their continuous space-time evolution. We propose an innovative nonparametric methodology to estimate the unknown space-time density of the point pattern or, equivalently, the intensity of an inhomogeneous space-time Poisson point process. The approach combines maximum likelihood estimation with roughness penalties, based on differential operators, defined over the spatial and temporal domains of interest. We first establish some important theoretical properties of the estimator, including its consistency. We then develop an efficient and flexible estimation procedure that leverages advanced numerical and computational techniques. Thanks to a discretization based on finite elements in space and B-splines in time, the proposed method can effectively capture complex multi-modal and strongly anisotropic spatio-temporal point patterns. Moreover, these point patterns may be observed over planar or curved domains with non-trivial geometries arising from geographic constraints, such as coastal regions with complicated shorelines or curved regions with complex orography. In addition to providing estimates, the method includes appropriate uncertainty quantification tools. We thoroughly validate the proposed method by means of simulation studies and applications to real-world data. The results highlight significant advantages over state-of-the-art competing approaches.

    Information criteria for inhomogeneous spatial point processes

    The theoretical foundation for a number of model selection criteria is established in the context of inhomogeneous point processes and under various asymptotic settings: infill, increasing domain, and combinations of these. For inhomogeneous Poisson processes we consider the Akaike information criterion and the Bayesian information criterion, and in particular we identify the point process analogue of the sample size needed for the Bayesian information criterion. For general inhomogeneous point processes we derive new composite likelihood and composite Bayesian information criteria for selecting a regression model for the intensity function. The proposed model selection criteria are evaluated using simulations of Poisson processes and cluster point processes. Comment: 6 figure

    Statistical Inference for Structured Spatial and Temporal Point Data

    The availability of large-scale spatial and temporal data has fueled increasing interest in statistical modelling and analysis. With recent developments in data collection and storage, observations can span an extremely wide range or an enormous number of cases, which tends to introduce heterogeneity among observations. This research therefore aims to explain the variability in spatial or temporal data while accounting for the correlation among observations. We first considered the intensity estimation problem for large spatial point patterns on complex domains in ℝ² (e.g., domains with irregular boundaries, sharp concavities, and/or interior holes due to geographic constraints) and on linear networks, where many existing spatial point process models suffer from "leakage" and computational problems. We proposed an efficient algorithm to estimate the spatially varying intensity function and to study the varying relationship between intensity and explanatory variables on complex domains. The method is built upon a graph regularization technique and hence can be flexibly applied to point patterns on complex domains such as regions with irregular boundaries and holes, or linear networks. An efficient proximal gradient optimization algorithm is proposed to handle large spatial point patterns. Numerical studies were conducted to illustrate the performance of the method. We also applied the method to study and visualize the intensity patterns of accidents on the Western Australia road network, and the spatial variation in the effects of income, lighting conditions, and population density on homicide occurrences in Toronto. In addition, spatial inhomogeneity occurs in various scenarios, especially for data lying in a very large space.
    We further established a spatially adaptive sampling design approach based on an estimate of the spatially varying underlying contamination distribution. This part of the research was motivated by arsenic exposure data collected from drinking water in private wells across the state of Iowa. From the perspective of public and environmental health management, it is critical to allocate limited resources to an effective arsenic sampling and testing plan for health risk mitigation. We propose a statistical regularization method to automatically detect spatial clusters of the underlying contamination risk from the currently available private well arsenic testing data in Iowa, USA. This approach allows us to develop a sampling design method that adapts to the changes in contamination risk across the identified clusters. Finally, we looked into clustering in structured temporal point data. Clustering event sequences from heterogeneous point processes is a challenging task, especially when event sequences are repeatedly observed and associated with multiple event types. To solve this problem, we proposed an efficient model-based clustering framework based on a novel multivariate mixture of functional point processes (MFPP). The proposed model generates event sequences from a multi-level log-Gaussian Cox process, which makes it possible to uncover complex latent patterns among sequences by imposing multiple latent random effects. We prove the identifiability of our mixture model and develop an effective semi-parametric Exponential-Solution (ES) algorithm for the proposed model. The effectiveness of the proposed framework is demonstrated through simulation studies and real data analyses.

    Semi-parametric models for multivariate point pattern data
