40 research outputs found

    Spatial Multiresolution Cluster Detection Method

    A novel multi-resolution cluster detection (MCD) method is proposed to identify irregularly shaped clusters in space. A multi-scale test statistic on a single cell is derived from the likelihood ratio statistic for Bernoulli, Poisson, and Normal sequences. A neighborhood variability measure is defined to select the optimal test threshold. Using simulation and fMRI data, the MCD method is compared with single-scale testing methods that control the false discovery rate and with spatial scan statistics. The MCD method is shown to be more effective at discovering irregularly shaped clusters, and its implementation does not require heavy computation, making it suitable for cluster detection in large spatial data.

    Spatially clustered associations in health GIS

    Overlaying maps using a desktop GIS is often the first step of a multivariate spatial analysis. The potential of this operation has increased considerably as data sources and Web services to manipulate them are becoming widely available via the Internet. Standards from the OGC enable such geospatial mashups to be seamless and user driven, involving discovery of thematic data. The user is naturally inclined to look for spatial clusters and correlation of outcomes. Using classical cluster detection scan methods to identify multivariate associations can be problematic in this context, because of a lack of control on or knowledge about background populations. For public health and epidemiological mapping, this limiting factor can be critical but often the focus is on spatial identification of risk factors associated with health or clinical status. In this article we point out that this association itself can ensure some control on underlying populations, and develop an exploratory scan statistic framework for multivariate associations. Inference using statistical map methodologies can be used to test the clustered associations. The approach is illustrated with a hypothetical data example and an epidemiological study on community MRSA. Scenarios of potential use for online mashups are introduced but full implementation is left for further research.
    [Figure caption: Spatial entropy index HSu for the ScankOO analysis of the hypothetical dataset, using a vicinity fixed by the number of points without distinction between their labels. The size of the labels is proportional to the inverse of the index.]

    A spatial scan statistic for zero-inflated Poisson process

    The scan statistic is widely used in spatial cluster detection applications of inhomogeneous Poisson processes. However, real data may depart substantially from the underlying Poisson process. One possible departure is an excess of zeros. Some studies point out that, when applied to data with excess zeros, the spatial scan statistic may produce biased inferences. In this work, we develop a closed-form scan statistic for cluster detection of spatial zero-inflated count data. We apply our methodology to simulated and real data. Our simulations revealed that the Scan-Poisson statistic steadily deteriorates as the number of zeros increases, producing biased inferences. In contrast, our proposed Scan-ZIP and Scan-ZIP+EM statistics are, most of the time, either superior or comparable to the Scan-Poisson statistic.
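
    For orientation, the baseline that this abstract calls Scan-Poisson can be sketched as below. This is a minimal illustration of the standard Kulldorff-style Poisson log-likelihood ratio, not the Scan-ZIP statistic proposed in the paper. The function names, the toy counts, and the hand-picked candidate zones are assumptions made for the example.

```python
import math


def poisson_llr(c, e, total):
    """Poisson log-likelihood ratio for one candidate zone.

    c: observed cases inside the zone
    e: expected cases inside the zone under the null (proportional to population)
    total: total observed cases in the study region
    Only zones with an excess of cases (c > e) score above zero.
    """
    if e <= 0 or c <= e:
        return 0.0
    inside = c * math.log(c / e)
    outside_cases = total - c
    if outside_cases > 0:
        outside = outside_cases * math.log(outside_cases / (total - e))
    else:
        outside = 0.0
    return inside + outside


def scan(cases, population, zones):
    """Return the candidate zone with the largest LLR and that LLR.

    zones is an iterable of tuples of region indices (illustrative only;
    a real scan would generate circular or other windows).
    """
    total_cases = sum(cases)
    total_pop = sum(population)
    best_zone, best_llr = None, 0.0
    for zone in zones:
        c = sum(cases[i] for i in zone)
        e = total_cases * sum(population[i] for i in zone) / total_pop
        llr = poisson_llr(c, e, total_cases)
        if llr > best_llr:
            best_zone, best_llr = zone, llr
    return best_zone, best_llr


# Toy data: four regions of equal population, with cases concentrated in region 0.
toy_cases = [8, 2, 1, 1]
toy_population = [100, 100, 100, 100]
toy_zones = [(0,), (1,), (0, 1), (2, 3)]
best_zone, best_llr = scan(toy_cases, toy_population, toy_zones)
```

    In practice the significance of the maximum LLR is assessed by Monte Carlo replication under the null. A zero-inflated variant would replace the Poisson likelihood used here.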

    Cluster Detection Tests in Spatial Epidemiology: A Global Indicator for Performance Assessment

    In disease cluster detection, local cluster detection tests (CDTs) are in common use. These methods aim both at locating likely clusters and at testing their statistical significance. New or improved CDTs are regularly proposed to epidemiologists and must be subjected to performance assessment. Because location accuracy has to be considered, performance assessment goes beyond the raw estimation of type I or II errors. As no consensus exists on performance evaluation, heterogeneous methods are used, and studies are therefore rarely comparable. A global indicator of performance that assesses both spatial accuracy and usual power would facilitate the exploration of CDT behaviour and help between-study comparisons. The Tanimoto coefficient (TC) is a well-known measure of similarity that can assess location accuracy, but only for one detected cluster. In a simulation study, performance is measured for many tests. From the TC, we here propose two statistics, the averaged TC and the cumulated TC, as indicators able to provide a global overview of CDT performance for both usual power and location accuracy. We demonstrate the properties of these two indicators and the superiority of the cumulated TC for assessing performance. We used these indicators to conduct a systematic spatial assessment displayed through performance maps.
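
    The Tanimoto coefficient for a single detected cluster can be sketched as follows. This is a minimal illustration of the usual set-based definition (size of the intersection over size of the union) applied to spatial unit identifiers. The function name and the example unit IDs are assumptions, and the averaged and cumulated TC proposed in the paper are not reproduced here because the abstract does not define them.

```python
def tanimoto(true_cluster, detected_cluster):
    """Tanimoto coefficient between the true and the detected cluster.

    Both arguments are iterables of spatial unit identifiers.
    Returns |A & B| / |A | B|, a value between 0 and 1.
    """
    true_units = set(true_cluster)
    detected_units = set(detected_cluster)
    union = true_units | detected_units
    if not union:
        # Convention chosen for this sketch: two empty clusters score 0.
        return 0.0
    return len(true_units & detected_units) / len(union)


# Hypothetical example: the true cluster covers units 1-4,
# and the test detects units 3-5.
tc_partial = tanimoto({1, 2, 3, 4}, {3, 4, 5})
```

    A value of 1 means the detected cluster matches the true cluster exactly, and 0 means the two share no spatial unit.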

    Evaluating the disparity of female breast cancer mortality among racial groups - a spatiotemporal analysis

    BACKGROUND: The literature suggests that the distribution of female breast cancer mortality demonstrates spatial concentration. There remains a lack of studies on how the mortality burden may impact racial groups across space and over time. The present study evaluated the geographic variations in breast cancer mortality in Texas females according to three predominant racial groups (non-Hispanic White, Black, and Hispanic females) over a twelve-year period. It sought to clarify whether the spatiotemporal trend might place an uneven burden on particular racial groups, and whether the excess trend has persisted into the current decade.
    METHODS: The Spatial Scan Statistic was employed to examine the geographic excess of breast cancer mortality by race in Texas counties between 1990 and 2001. The scan was conducted with a maximum temporal window of 90% of the study period and a maximum spatial cluster size of 50% of the population at risk. The next scan was conducted with a purely spatial option to verify whether the excess mortality persisted further. Spatial queries were performed to locate the regions of excess mortality affecting multiple racial groups.
    RESULTS: The first scan identified 4 regions with breast cancer mortality excess in both non-Hispanic White and Hispanic female populations. The most likely excess mortality, with a relative risk of 1.12 (p = 0.001), occurred between 1990 and 1996 for non-Hispanic Whites and included 42 Texas counties along the Gulf Coast and in Central Texas. For Hispanics, West Texas, with a relative risk of 1.18, was the most probable region of excess mortality (p = 0.001). Results of the second scan were identical to the first. This suggested that the excess mortality might not persist to the present decade. Spatial queries found that 3 counties in Southeast and 9 counties in Central Texas had excess mortality involving multiple racial groups.
    CONCLUSION: Spatiotemporal variations in breast cancer mortality affected racial groups at varying levels. There was neither evidence of hot-spot clusters nor persistent spatiotemporal trends of excess mortality into the present decade. Non-Hispanic Whites in the Gulf Coast and Hispanics in West Texas carried the highest burden of mortality, as evidenced by spatial concentration and temporal persistence.

    Segregation Indices for Disease Clustering

    Spatial clustering has important implications in various fields. In particular, disease clustering is of major public concern in epidemiology. In this article, we propose the use of two distance-based segregation indices to test the significance of disease clustering among subjects whose locations are from a homogeneous or an inhomogeneous population. We derive their asymptotic distributions and compare them with other distance-based disease clustering tests in terms of empirical size and power by extensive Monte Carlo simulations. The null pattern we consider is the random labeling (RL) of cases and controls to the given locations. Along this line, we investigate the sensitivity of the size of these tests to the underlying background pattern (e.g., clustered or homogeneous) on which the RL is applied, the level of clustering and number of clusters, and differences in relative abundances of the classes. We demonstrate that differences in relative abundance have the highest impact on the empirical sizes of the tests. We also propose various non-RL patterns as alternatives to the RL pattern and assess the empirical power performance of the tests under these alternatives. We illustrate the methods on two real-life examples from epidemiology. Comment: 31 pages, 13 figures, 3 tables.
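
    The random labeling null described above can be sketched as a Monte Carlo test. The statistic used here, the fraction of cases whose nearest neighbour is also a case, is a simple stand-in chosen for illustration and is not one of the segregation indices proposed in the paper. The function names, the toy coordinates, and the permutation count are likewise assumptions.

```python
import random


def nearest_neighbours(points):
    """Index of the nearest other point for every point (ties go to the lowest index)."""
    nn = []
    for i, (xi, yi) in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        nn.append(min(others, key=lambda j: (points[j][0] - xi) ** 2
                                            + (points[j][1] - yi) ** 2))
    return nn


def random_labelling_test(points, labels, n_perm=999, seed=0):
    """Monte Carlo test under random labeling of cases (1) and controls (0).

    Locations stay fixed; only the labels are permuted.
    Returns the observed statistic and a one-sided Monte Carlo p-value.
    """
    rng = random.Random(seed)
    nn = nearest_neighbours(points)

    def case_nn_fraction(labs):
        cases = [i for i, lab in enumerate(labs) if lab == 1]
        return sum(labs[nn[i]] == 1 for i in cases) / len(cases)

    observed = case_nn_fraction(labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # one random relabeling of the fixed locations
        if case_nn_fraction(shuffled) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Toy data: four cases packed in one square, eight controls in two other squares.
square = [(0, 0), (0, 1), (1, 0), (1, 1)]
toy_points = (square
              + [(x + 10, y + 10) for x, y in square]
              + [(x + 20, y) for x, y in square])
toy_labels = [1] * 4 + [0] * 8
observed_stat, p_value = random_labelling_test(toy_points, toy_labels)
```

    Because the locations are held fixed, the test conditions on the background pattern, which is why it remains usable when the population is inhomogeneous.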