
    Score, Pseudo-Score and Residual Diagnostics for Spatial Point Process Models

    We develop new tools for formal inference and informal model validation in the analysis of spatial point pattern data. The score test is generalized to a "pseudo-score" test derived from Besag's pseudo-likelihood, and to a class of diagnostics based on point process residuals. The results lend theoretical support to the established practice of using functional summary statistics, such as Ripley's K-function, when testing for complete spatial randomness; and they provide new tools, such as the compensator of the K-function, for testing other fitted models. The results also support localization methods such as the scan statistic and smoothed residual plots. Software for computing the diagnostics is provided. Comment: Published in Statistical Science (http://www.imstat.org/sts/, http://dx.doi.org/10.1214/11-STS367) by the Institute of Mathematical Statistics (http://www.imstat.org).
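As a rough illustration of the K-function at the heart of these diagnostics, here is a minimal, edge-uncorrected estimator in Python. The function name `ripley_k` and the naive border handling are our own sketch, not the paper's provided software; production analyses would use an edge-corrected estimator.

```python
import numpy as np

def ripley_k(points, r_values, window_area):
    """Naive (edge-uncorrected) estimate of Ripley's K-function.

    points: (n, 2) array of locations; window_area: area of the window.
    """
    n = len(points)
    lam = n / window_area  # intensity estimate
    # pairwise distances between distinct points
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    # K(r) = (1/lam) * average number of further points within distance r
    return np.array([(d < r).sum() / (n * lam) for r in r_values])

# Under complete spatial randomness, K(r) is approximately pi * r^2.
rng = np.random.default_rng(0)
pts = rng.uniform(0, 1, size=(500, 2))  # CSR on the unit square
K = ripley_k(pts, np.array([0.05, 0.1]), window_area=1.0)
```

Testing a fitted model against the compensator of K, as the paper proposes, amounts to comparing an empirical curve like this one with its model-based counterpart.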

    A spatio-temporal hybrid Strauss hardcore point process for forest fire occurrences

    We propose a new point process model that combines, in the spatio-temporal setting, both multi-scaling by hybridization and hardcore distances. Our so-called hybrid Strauss hardcore point process model allows different types of interaction, at different spatial and/or temporal scales, that might be of interest in environmental and biological applications. Inference and simulation for the model are implemented using the logistic likelihood approach and the birth-death Metropolis-Hastings algorithm. Our model is illustrated and compared to others on two datasets of forest fire occurrences, in France and Spain, respectively.
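To illustrate the birth-death Metropolis-Hastings idea mentioned above, here is a minimal sketch that samples a plain hardcore process (the Strauss hardcore model with interaction parameter gamma = 0, a single scale, and no temporal component) on the unit square. The sampler and its parameter names are our simplification, not the authors' implementation.

```python
import numpy as np

def simulate_hardcore(beta, hc, n_steps, rng):
    """Birth-death Metropolis-Hastings sampler for a hardcore process
    on the unit square: no two points may lie closer than hc.

    beta: intensity parameter of the unnormalized density beta^n * 1{hardcore}.
    """
    pts = []
    for _ in range(n_steps):
        if rng.random() < 0.5 or not pts:  # propose a birth
            new = rng.uniform(0, 1, size=2)
            ok = all(np.linalg.norm(new - p) >= hc for p in pts)
            # birth acceptance ratio: beta / (n + 1); zero if hardcore violated
            if ok and rng.random() < min(1.0, beta / (len(pts) + 1)):
                pts.append(new)
        else:  # propose a death of a uniformly chosen point
            i = rng.integers(len(pts))
            # death acceptance ratio: n / beta
            if rng.random() < min(1.0, len(pts) / beta):
                pts.pop(i)
    return np.array(pts)

rng = np.random.default_rng(1)
pattern = simulate_hardcore(beta=100.0, hc=0.05, n_steps=5000, rng=rng)
```

Every returned pattern respects the hardcore constraint by construction; the hybrid model in the paper layers additional Strauss-type interactions at further scales on top of this mechanism.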

    A cross-validation-based statistical theory for point processes

    Motivated by cross-validation’s general ability to reduce overfitting and mean square error, we develop a cross-validation-based statistical theory for general point processes. It is based on the combination of two novel concepts for general point processes: cross-validation and prediction errors. Our cross-validation approach uses thinning to split a point process/pattern into pairs of training and validation sets, while our prediction errors measure discrepancy between two point processes. The new statistical approach, which may be used to model different distributional characteristics, exploits the prediction errors to measure how well a given model predicts validation sets using associated training sets. Having indicated that our new framework generalizes many existing statistical approaches, we then establish different theoretical properties for it, including large sample properties. We further recognize that non-parametric intensity estimation is an instance of Papangelou conditional intensity estimation, which we exploit to apply our new statistical theory to kernel intensity estimation. Using independent thinning-based cross-validation, we numerically show that the new approach substantially outperforms the state of the art in bandwidth selection. Finally, we carry out intensity estimation for a dataset in forestry (Euclidean domain) and a dataset in neurology (linear network).
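The independent thinning-based split described above is simple to sketch: each point is retained for the training set independently with probability p, and the complement forms the validation set. The helper below is a hypothetical illustration, not the authors' code.

```python
import numpy as np

def thinning_split(points, p, rng):
    """Split a point pattern into training/validation sets by
    independent thinning: each point is kept for training with
    probability p; the removed points form the validation set.
    """
    keep = rng.random(len(points)) < p
    return points[keep], points[~keep]

rng = np.random.default_rng(2)
pattern = rng.uniform(0, 1, size=(1000, 2))
train, valid = thinning_split(pattern, p=0.8, rng=rng)
# Repeating the split yields multiple training/validation pairs, over
# which a prediction error can be averaged to select, e.g., a bandwidth.
```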

    Forest Fire Risk Assessment Using Point Process Modeling & Monte Carlo Fire Simulation: A Case Study in Gyeongju, South Korea

    Forest fire risk assessment has become critical for developing forest and fire management strategies in Korea, since the magnitude of damage from fires has increased significantly over the past decades. Fire behavior probability is one of the major components in quantifying fire risk, and is often presented as burn probability. Burn probability estimation requires a proper estimate of fire occurrence probability, because fire spread is largely influenced by ignition locations in addition to other environmental factors such as weather, topography, and land cover. The objective of this study is to assess forest fire risk over a large forested landscape in and around the City of Gyeongju, Republic of Korea, while incorporating fire occurrence probability into the estimation of burn probability. A fire occurrence probability model with spatial covariates and autocorrelation was developed using historical records of fire occurrence between 1991 and 2012 and a spatial point process (SPP) method. A total of 502 fire ignition points were generated using the fire occurrence probability model. Monte Carlo fire spread simulations were performed from the ignition points under the 95% extreme weather scenario, resulting in a burn probability estimate for each land parcel across the landscape. Finally, the burn probability was combined with government-appraised land property values to assess the potential loss in value per land parcel due to forest fires. The density of forest fires in the study landscape was associated with lower elevation, moderate slope, coniferous land cover, distance to road, and higher tomb density. A positive spatial autocorrelation between the locations of fire ignitions was also found. An area-interaction point process model including the spatial covariate effects and an interpoint interaction term appeared to be suitable as a fire occurrence probability model. A correlation analysis among the fire occurrence probability, burn probability, land property value, and potential value loss indicates that fire risk is largely associated with the spatial pattern of burn probability (Pearson's correlation = 0.7084). These results can provide forest and fire management authorities in the study region with useful information for decision making. It is also hoped that the methodology presented here can provide an improved framework for assessing fire risk in other regions.
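The Monte Carlo burn-probability idea can be sketched on a toy grid: draw ignition cells from an occurrence intensity, "burn" a neighborhood around each ignition, and average over simulations. The radius-based spread below is a deliberate stand-in for a real fire spread simulator, and all names are hypothetical.

```python
import numpy as np

def burn_probability(intensity, n_sims, rng, max_radius=3):
    """Toy Monte Carlo burn-probability map on a grid.

    intensity: 2-D array proportional to fire occurrence probability
    per cell. Each simulation draws one ignition cell from the
    intensity and burns all cells within a random radius (a crude
    stand-in for a weather- and fuel-driven fire spread model).
    """
    h, w = intensity.shape
    p = (intensity / intensity.sum()).ravel()
    burns = np.zeros((h, w))
    rows, cols = np.indices((h, w))
    for _ in range(n_sims):
        cell = rng.choice(h * w, p=p)           # ignition cell
        r0, c0 = divmod(cell, w)
        radius = rng.integers(1, max_radius + 1)
        burns += (rows - r0) ** 2 + (cols - c0) ** 2 <= radius ** 2
    return burns / n_sims  # per-cell burn probability

rng = np.random.default_rng(3)
intensity = np.ones((20, 20))  # uniform toy ignition intensity
bp = burn_probability(intensity, n_sims=200, rng=rng)
```

Multiplying such a map cell-wise by appraised land values gives a potential-loss map in the spirit of the study's risk assessment.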

    Contributions to statistical analysis methods for neural spiking activity

    With the technical advances in neuroscience experiments in the past few decades, we have seen a massive expansion in our ability to record neural activity. These advances enable neuroscientists to analyze more complex neural coding and communication properties, and at the same time, raise new challenges for analyzing neural spiking data, which keeps growing in scale, dimension, and complexity. This thesis proposes several new statistical methods that advance statistical analysis approaches for neural spiking data, including sequential Monte Carlo (SMC) methods for efficient estimation of neural dynamics from membrane potential threshold crossings, state-space models using multimodal observation processes, and goodness-of-fit analysis methods for neural marked point process models. In the first project, we derive a set of iterative formulas that enable us to simulate trajectories from stochastic, dynamic neural spiking models that are consistent with a set of spike time observations. We develop an SMC method to simultaneously estimate the parameters of the model and the unobserved dynamic variables from spike train data. We investigate the performance of this approach on a leaky integrate-and-fire model. In another project, we define a semi-latent state-space model to estimate information related to the phenomenon of hippocampal replay. Replay is a recently discovered phenomenon where patterns of hippocampal spiking activity that typically occur during exploration of an environment are reactivated when an animal is at rest. This reactivation is accompanied by high frequency oscillations in hippocampal local field potentials. However, methods to define replay mathematically remain undeveloped. In this project, we construct a novel state-space model that enables us to identify whether replay is occurring, and if so, to estimate the movement trajectories consistent with the observed neural activity, and to categorize the content of each event. The state-space model integrates information from the spiking activity of the hippocampal population, the rhythms in the local field potential, and the rat's movement behavior. Finally, we develop a new, general time-rescaling theorem for marked point processes, and use this to develop a general goodness-of-fit framework for neural population spiking models. We investigate this approach through simulation and a real data application.
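The time-rescaling idea underlying such goodness-of-fit frameworks can be sketched as follows: transform each spike time by the integrated intensity, so that under a correctly specified model the rescaled inter-spike intervals are i.i.d. unit-rate exponential, and u = 1 - exp(-interval) is uniform on [0, 1]. This sketch covers only the classical unmarked case, not the thesis's marked-point-process generalization.

```python
import numpy as np

def time_rescale(spike_times, intensity_fn, t_grid):
    """Time-rescale spike times by the integrated conditional intensity.

    Returns u-values that should be uniform on [0, 1] when the model
    intensity is correct (the time-rescaling theorem).
    """
    lam = intensity_fn(t_grid)
    # cumulative integral Lambda(t) via a left Riemann sum on t_grid
    Lam = np.concatenate([[0.0], np.cumsum(lam[:-1] * np.diff(t_grid))])
    rescaled = np.interp(spike_times, t_grid, Lam)  # Lambda(t_k)
    intervals = np.diff(rescaled)  # ~ Exp(1) under the true model
    return 1.0 - np.exp(-intervals)

# Sanity check: a homogeneous Poisson process with rate 5, rescaled
# with its true intensity, should yield approximately uniform u-values.
rng = np.random.default_rng(4)
times = np.cumsum(rng.exponential(1 / 5.0, size=400))
t_grid = np.linspace(0, times[-1], 10000)
u = time_rescale(times, lambda t: np.full_like(t, 5.0), t_grid)
```

A Kolmogorov-Smirnov test of the u-values against the uniform distribution then serves as the goodness-of-fit assessment.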

    Kernelized Stein Discrepancy Tests of Goodness-of-fit for Time-to-Event Data

    Survival Analysis and Reliability Theory are concerned with the analysis of time-to-event data, in which observations correspond to waiting times until an event of interest, such as death from a particular disease or failure of a component in a mechanical system. This type of data is unique due to the presence of censoring, a type of missing data that occurs when we do not observe the actual time of the event of interest but, instead, have access to an approximation for it given by a random interval to which the observation is known to belong. Most traditional methods are not designed to deal with censoring, and thus we need to adapt them to censored time-to-event data. In this paper, we focus on non-parametric goodness-of-fit testing procedures based on combining Stein's method and kernelized discrepancies. While for uncensored data there is a natural way of implementing a kernelized Stein discrepancy test, for censored data there are several options, each with different advantages and disadvantages. In this paper, we propose a collection of kernelized Stein discrepancy tests for time-to-event data, and we study each of them theoretically and empirically; our experimental results show that our proposed methods perform better than existing tests, including previous tests based on a kernelized maximum mean discrepancy. Comment: Proceedings of the International Conference on Machine Learning, 202
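For intuition, here is a minimal U-statistic estimate of the squared kernelized Stein discrepancy for uncensored data, with a standard normal target and an RBF kernel; the censored-data adaptations that are the paper's contribution are not shown, and the bandwidth choice is arbitrary.

```python
import numpy as np

def ksd_gauss(x, h=1.0):
    """U-statistic estimate of the squared kernelized Stein discrepancy
    between a sample x and the standard normal, using an RBF kernel
    k(x, y) = exp(-(x - y)^2 / (2 h^2)). The N(0,1) score is s(x) = -x.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x[:, None] - x[None, :]
    k = np.exp(-d**2 / (2 * h**2))
    sx, sy = -x[:, None], -x[None, :]
    # Stein kernel: s(x)s(y)k + s(x) dk/dy + s(y) dk/dx + d2k/dxdy
    up = (sx * sy * k
          + sx * (d / h**2) * k
          + sy * (-d / h**2) * k
          + (1 / h**2 - d**2 / h**4) * k)
    np.fill_diagonal(up, 0.0)      # U-statistic: exclude diagonal terms
    return up.sum() / (n * (n - 1))

rng = np.random.default_rng(5)
good = ksd_gauss(rng.normal(0, 1, 300))  # sample from the model: near 0
bad = ksd_gauss(rng.normal(2, 1, 300))   # shifted sample: clearly larger
```

A test then compares the statistic against a null distribution obtained, for example, by a wild bootstrap.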