40 research outputs found
Spatial Multiresolution Cluster Detection Method
A novel multi-resolution cluster detection (MCD) method is proposed to
identify irregularly shaped clusters in space. A multi-scale test statistic on a
single cell is derived from the likelihood ratio statistic for Bernoulli,
Poisson and Normal sequences. A neighborhood variability
measure is defined to select the optimal test threshold. The MCD method is
compared with single-scale testing methods that control the false discovery rate
and with the spatial scan statistic, using simulation and fMRI data. The MCD method
is shown to be more effective for discovering irregularly shaped clusters, and
its implementation does not require heavy computation, making it
suitable for cluster detection in large spatial data.
Spatially clustered associations in health GIS
Overlaying maps using a desktop GIS is often the first step of a multivariate spatial analysis. The potential of this operation has increased considerably as data sources and Web services to manipulate them are becoming widely available via the Internet. Standards from the OGC enable such geospatial mashups to be seamless and user driven, involving discovery of thematic data. The user is naturally inclined to look for spatial clusters and correlation of outcomes. Using classical cluster detection scan methods to identify multivariate associations can be problematic in this context, because of a lack of control over, or knowledge about, background populations. For public health and epidemiological mapping, this limiting factor can be critical, but often the focus is on spatial identification of risk factors associated with health or clinical status. In this article we point out that this association itself can ensure some control on underlying populations, and develop an exploratory scan statistic framework for multivariate associations. Inference using statistical map methodologies can be used to test the clustered associations. The approach is illustrated with a hypothetical data example and an epidemiological study on community MRSA. Scenarios of potential use for online mashups are introduced but full implementation is left for further research.
[Figure caption: Spatial entropy index HSu for the ScankOO analysis of the hypothetical dataset using a vicinity which is fixed by the number of points without distinction between their labels. The size of the labels is proportional to the inverse of the index.]
A spatial scan statistic for zero-inflated Poisson process
The scan statistic is widely used in spatial cluster detection applications
of inhomogeneous Poisson processes. However, real data may present substantial
departure from the underlying Poisson process. One of the possible departures
has to do with zero excess. Some studies point out that when applied to data
with excess zeros, the spatial scan statistic may produce biased inferences. In
this work, we develop a closed-form scan statistic for cluster detection of
spatial zero-inflated count data. We apply our methodology to simulated and
real data. Our simulations revealed that the Scan-Poisson statistic steadily
deteriorates as the number of zeros increases, producing biased inferences. On
the other hand, our proposed Scan-ZIP and Scan-ZIP+EM statistics are, most of
the time, either superior or comparable to the Scan-Poisson statistic
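For orientation, the Scan-Poisson baseline mentioned above can be sketched as follows. This is a minimal illustration of the standard Poisson log-likelihood ratio for a single candidate zone, not the Scan-ZIP or Scan-ZIP+EM statistics proposed in the abstract. The function names and the `zones` input format are assumptions made for this example, and significance testing (usually by Monte Carlo replication) is left out.

```python
import math

def poisson_scan_llr(cases_in, expected_in, total_cases):
    """Poisson log-likelihood ratio for one candidate zone.

    cases_in     observed cases inside the zone
    expected_in  expected cases inside the zone under the null (> 0),
                 scaled so expectations over the whole region sum to total_cases
    total_cases  observed cases in the whole study region
    """
    # Only zones with more cases than expected count as candidate clusters.
    if cases_in <= expected_in:
        return 0.0
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    llr = cases_in * math.log(cases_in / expected_in)
    if cases_out > 0:  # the term 0 * log(0) is taken as 0
        llr += cases_out * math.log(cases_out / expected_out)
    return llr

def most_likely_zone(zones, total_cases):
    """Return the label of the zone with the largest log-likelihood ratio.

    zones: dict mapping a (hypothetical) zone label to (cases_in, expected_in).
    """
    return max(zones, key=lambda z: poisson_scan_llr(*zones[z], total_cases))
```

With many zero counts the Poisson expectations used here can be badly calibrated, which is the kind of departure the abstract reports for the Scan-Poisson statistic.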
Tango's maximized excess events test with different weights
BACKGROUND: Tango's maximized excess events test (MEET) has been shown to have very good statistical power in detecting global disease clustering. A nice feature of this test is that it considers a range of spatial scale parameters, adjusting for the multiple testing. This means that it has good power to detect a wide range of clustering processes. The test depends on the functional form of a weight function, and it is unknown how sensitive the test is to the choice of this weight function and what function provides optimal power for different clustering processes. In this study, we evaluate the performance of the test for a wide range of weight functions. RESULTS: The power varies greatly with different choice of weight. Tango's original choice for the weight function works very well. There are also other weight functions that provide good power. CONCLUSION: We recommend the use of Tango's MEET to test global disease clustering, either with the original weight or one of the alternate weights that have good power
Power evaluation of disease clustering tests
BACKGROUND: Many different test statistics have been proposed to test for spatial clustering. Some of these statistics have been widely used in various applications. In this paper, we use an existing collection of 1,220,000 simulated benchmark data, generated under 51 different clustering models, to compare the statistical power of several disease clustering tests. These tests are Besag-Newell's R, Cuzick-Edwards' k-Nearest Neighbors (k-NN), the spatial scan statistic, Tango's Maximized Excess Events Test (MEET), Swartz' entropy test, Whittemore's test, Moran's I and a modification of Moran's I. RESULTS: Except for Moran's I and Whittemore's test, all other tests have good power for detecting some kind of clustering. The spatial scan statistic is good at detecting localized clusters. Tango's MEET is good at detecting global clustering. With appropriate choice of parameter, Besag-Newell's R and Cuzick-Edwards' k-NN also perform well. CONCLUSION: The power varies greatly for different test statistics and alternative clustering models. Consideration of the power is important before we decide which test statistic to use
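As one concrete instance of the statistics compared above, Moran's I can be sketched in a few lines. The sketch assumes the usual textbook definition, I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, with W the sum of all weights. The function name, the list-of-lists weight matrix and the toy chain of four regions are assumptions for illustration and are not taken from the paper or its benchmark data.

```python
def morans_i(values, weights):
    """Moran's I for regional values x_i and a spatial weight matrix w_ij.

    values   list of n numbers (for example, regional rates)
    weights  n x n list of lists; weights[i][j] > 0 when regions i and j
             are treated as neighbours, with a zero diagonal
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, weighted by spatial proximity.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den

# Four regions on a line, each adjacent to its immediate neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 0, 0], chain)    # similar values sit together
alternating = morans_i([1, 0, 1, 0], chain)  # neighbours always differ
```

Positive values indicate that neighbouring regions tend to have similar values, and negative values indicate the opposite. The statistic summarises the whole map and does not locate a cluster.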
The Spatial Structure of Autism in California, 1993-2001
This article identifies significant high-risk clusters of autism based on residence at birth in California for children born from 1993 to 2001. These clusters are geographically stable. Children born in a primary cluster are at four times greater risk for autism than children living in other parts of the state. This is comparable to the difference between males and females and twice the risk estimated for maternal age over 40. In every year roughly 3% of the new caseload of autism in California arises from the primary cluster we identify, a small zone 20 km by 50 km. We identify a set of secondary clusters that support the existence of the primary clusters. The identification of robust spatial clusters indicates that autism does not arise from a global treatment and indicates that important drivers of increased autism prevalence are located at the local level.
Cluster Detection Tests in Spatial Epidemiology: A Global Indicator for Performance Assessment
In disease cluster detection, local cluster detection tests (CDTs) are in common use. These methods aim both at locating likely clusters and at testing their statistical significance. New or improved CDTs are regularly proposed to epidemiologists and must be subjected to performance assessment. Because location accuracy has to be considered, performance assessment goes beyond the raw estimation of type I or II errors. As no consensus exists for performance evaluations, heterogeneous methods are used, and therefore studies are rarely comparable. A global indicator of performance, which assesses both spatial accuracy and usual power, would facilitate the exploration of CDT behaviour and help between-study comparisons. The Tanimoto coefficient (TC) is a well-known measure of similarity that can assess location accuracy but only for one detected cluster. In a simulation study, performance is measured for many tests. From the TC, we here propose two statistics, the averaged TC and the cumulated TC, as indicators able to provide a global overview of CDT performance for both usual power and location accuracy. We demonstrate the properties of these two indicators and the superiority of the cumulated TC for assessing performance. We used these indicators to conduct a systematic spatial assessment displayed through performance maps.
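The Tanimoto coefficient for a single detected cluster can be sketched as below, assuming the usual set definition (size of the intersection divided by size of the union) applied to the spatial units in the detected and the true cluster. The function name and the unit identifiers are hypothetical. The averaged TC and cumulated TC proposed in the abstract are not reproduced, because their definitions are not given here.

```python
def tanimoto(detected, true_cluster):
    """Tanimoto coefficient between a detected cluster and the true cluster.

    Both arguments are iterables of spatial-unit identifiers.
    Returns a value in [0, 1]; 1 means the detected cluster matches exactly.
    """
    detected, true_cluster = set(detected), set(true_cluster)
    if not true_cluster:
        raise ValueError("the true cluster must contain at least one unit")
    shared = detected & true_cluster
    combined = detected | true_cluster
    return len(shared) / len(combined)

# Hypothetical unit identifiers: two of four distinct units overlap.
overlap_score = tanimoto({"u1", "u2", "u3"}, {"u2", "u3", "u4"})
```

A test that detects nothing scores 0, and a test that flags too large a zone is penalised through the union in the denominator.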
Evaluating the disparity of female breast cancer mortality among racial groups - a spatiotemporal analysis
BACKGROUND: The literature suggests that the distribution of female breast cancer mortality demonstrates spatial concentration. There remains a lack of studies on how the mortality burden may impact racial groups across space and over time. The present study evaluated the geographic variations in breast cancer mortality in Texas females according to three predominant racial groups (non-Hispanic White, Black, and Hispanic females) over a twelve-year period. It sought to clarify whether the spatiotemporal trend might place an uneven burden on particular racial groups, and whether the excess trend has persisted into the current decade. METHODS: The Spatial Scan Statistic was employed to examine the geographic excess of breast cancer mortality by race in Texas counties between 1990 and 2001. The statistic was conducted with a scan window of a maximum of 90% of the study period and a spatial cluster size of 50% of the population at risk. The next scan was conducted with a purely spatial option to verify whether the excess mortality persisted further. Spatial queries were performed to locate the regions of excess mortality affecting multiple racial groups. RESULTS: The first scan identified 4 regions with breast cancer mortality excess in both non-Hispanic White and Hispanic female populations. The most likely excess mortality with a relative risk of 1.12 (p = 0.001) occurred between 1990 and 1996 for non-Hispanic Whites, including 42 Texas counties along Gulf Coast and Central Texas. For Hispanics, West Texas with a relative risk of 1.18 was the most probable region of excess mortality (p = 0.001). Results of the second scan were identical to the first. This suggested that the excess mortality might not persist to the present decade. Spatial queries found that 3 counties in Southeast and 9 counties in Central Texas had excess mortality involving multiple racial groups. 
CONCLUSION: Spatiotemporal variations in breast cancer mortality affected racial groups at varying levels. There was neither evidence of hot-spot clusters nor persistent spatiotemporal trends of excess mortality into the present decade. Non-Hispanic Whites in the Gulf Coast and Hispanics in West Texas carried the highest burden of mortality, as evidenced by spatial concentration and temporal persistence
Segregation Indices for Disease Clustering
Spatial clustering has important implications in various fields. In
particular, disease clustering is of major public concern in epidemiology. In
this article, we propose the use of two distance-based segregation indices to
test the significance of disease clustering among subjects whose locations are
from a homogeneous or an inhomogeneous population. We derive their asymptotic
distributions and compare them with other distance-based disease clustering
tests in terms of empirical size and power by extensive Monte Carlo
simulations. The null pattern we consider is the random labeling (RL) of cases
and controls to the given locations. Along this line, we investigate the
sensitivity of the size of these tests to the underlying background pattern
(e.g., clustered or homogeneous) on which the RL is applied, the level of
clustering and number of clusters, or differences in relative abundances of the
classes. We demonstrate that differences in relative abundance have the highest
impact on the empirical sizes of the tests. We also propose various non-RL
patterns as alternatives to the RL pattern and assess the empirical power
performance of the tests under these alternatives. We illustrate the methods on
two real-life examples from epidemiology.
Comment: 31 pages, 13 figures, 3 tables
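The random labeling (RL) null can be illustrated with a small Monte Carlo sketch: locations stay fixed and the case/control labels are shuffled. The sketch uses a Cuzick-Edwards style k-nearest-neighbour count as a stand-in test statistic, since the segregation indices proposed in the abstract are not defined here. All function names, parameters and defaults are assumptions for this example.

```python
import math
import random

def knn_indices(points, k):
    """For each point, indices of its k nearest other points (Euclidean).

    Distance ties are broken by index order, an arbitrary choice.
    """
    neighbors = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        neighbors.append(others[:k])
    return neighbors

def case_neighbor_count(labels, neighbors):
    """Count, over all cases, how many of their k nearest neighbours are cases."""
    return sum(labels[j]
               for i, nbrs in enumerate(neighbors) if labels[i]
               for j in nbrs)

def random_labeling_pvalue(points, labels, k=1, n_sim=999, seed=0):
    """Monte Carlo p-value under RL; labels are 1 for cases, 0 for controls."""
    rng = random.Random(seed)
    neighbors = knn_indices(points, k)  # locations are fixed under RL
    observed = case_neighbor_count(labels, neighbors)
    shuffled = list(labels)
    at_least_as_large = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)  # reassign labels to the same locations
        if case_neighbor_count(shuffled, neighbors) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_sim + 1)
```

Because shuffling keeps the number of cases and controls fixed, the reference distribution depends on the relative abundance of the two classes, the factor the abstract reports as having the largest effect on empirical size.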