8 research outputs found

    Adjusting for sampling variability in sparse data: geostatistical approaches to disease mapping

    Background: Disease maps of crude rates from routinely collected health data indexed at a small geographical resolution pose specific statistical problems due to the sparse nature of the data. Spatial smoothers allow areas to borrow strength from neighboring regions to produce a more stable estimate of the areal value. Geostatistical smoothers are able to quantify the uncertainty in smoothed rate estimates without a high computational burden. In this paper, we introduce a uniform model extension of Bayesian Maximum Entropy (UMBME) and compare its performance to that of Poisson kriging in measures of smoothing strength and estimation accuracy as applied to simulated data and the real data example of HIV infection in North Carolina. The aim is to produce more reliable maps of disease rates in small areas to improve identification of spatial trends at the local level. Results: In all data environments, Poisson kriging exhibited greater smoothing strength than UMBME. With the simulated data where the true latent rate of infection was known, Poisson kriging resulted in greater estimation accuracy with data that displayed low spatial autocorrelation, while UMBME provided more accurate estimators with data that displayed higher spatial autocorrelation. With the HIV data, UMBME performed slightly better than Poisson kriging in cross-validatory predictive checks, with both models performing better than the observed data model with no smoothing. Conclusions: Smoothing methods have different advantages depending upon both internal model assumptions that affect smoothing strength and external data environments, such as spatial correlation of the observed data. Further model comparisons in different data environments are required to provide public health practitioners with the guidelines needed to choose the most appropriate smoothing method for their particular health dataset.

    Influence of Detection Method and Study Area Scale on Syphilis Cluster Identification in North Carolina

    Identifying geographical clusters of sexually transmitted infections can aid in targeting prevention and control efforts. However, detectable clusters can vary between detection methods because of different underlying assumptions. Furthermore, because disease burden is not geographically homogeneous, the reference population is sensitive to the study area scale, affecting cluster outcomes. We investigated the influence of cluster detection method and geographical scale on syphilis cluster detection in Mecklenburg County, North Carolina.

    Geomasking sensitive health data and privacy protection: an evaluation using an E911 database

    Geomasking is used to provide privacy protection for individual address information while maintaining spatial resolution for mapping purposes. Donut geomasking and other random perturbation geomasking algorithms rely on the assumption of a homogeneously distributed population to calculate displacement distances, leading to possible under-protection of individuals when this condition is not met. Using household data from 2007, we evaluated the performance of donut geomasking in Orange County, North Carolina. We calculated the estimated k-anonymity for every household based on the assumption of uniform household distribution. We then determined the actual k-anonymity by revealing household locations contained in the county E911 database. Census block groups in mixed-use areas with high population distribution heterogeneity were the most likely to have privacy protection below selected criteria. For heterogeneous populations, we suggest tripling the minimum displacement area in the donut to protect privacy with an error rate below 1%.
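The contrast between estimated and actual k-anonymity can be illustrated with a rough sketch. The abstract does not give the exact definitions used in the study, so the ones below are assumptions: the estimated value is the uniform household density multiplied by the area of a circle of a given radius, and the actual value is a count of known household points within that radius of the masked location. The function names `estimated_k` and `actual_k` are hypothetical.

```python
import math

def estimated_k(density, radius):
    """Expected household count in a circle of `radius`, ASSUMING a
    uniform household density (households per unit area)."""
    return density * math.pi * radius ** 2

def actual_k(masked_point, households, radius):
    """Count known household locations within `radius` of the masked point.
    `households` is an iterable of (x, y) pairs in projected coordinates."""
    mx, my = masked_point
    return sum(
        1 for hx, hy in households
        if math.hypot(hx - mx, hy - my) <= radius
    )
```

Where households are clustered or sparse, the count returned by `actual_k` can fall well below the `estimated_k` value, which is the kind of under-protection the abstract describes.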

    Progression of a large syphilis outbreak in rural North Carolina through space and time: Application of a Bayesian Maximum Entropy graphical user interface

    In 2001, the primary and secondary syphilis incidence rate in rural Columbus County, North Carolina, was the highest in the nation. To understand the development of syphilis outbreaks in rural areas, we developed and used the Bayesian Maximum Entropy Graphical User Interface (BMEGUI) to map syphilis incidence rates from 1999 to 2004 in seven adjacent counties in North Carolina. Using BMEGUI, incidence rate maps were constructed for two aggregation scales (ZIP code and census tract) with two approaches (Poisson and simple kriging). The BME maps revealed the outbreak was initially localized in Robeson County and possibly connected to more urban endemic cases in adjacent Cumberland County. The outbreak spread to rural Columbus County in a leapfrog pattern, with the subsequent development of a visible low-incidence spatial corridor linking Robeson County with the rural areas of Columbus County. Though the data are from the early 2000s, they remain pertinent: the combination of spatial data with extensive sexual network analyses, particularly in rural areas, gives insights that have not been replicated in the past two decades. These observations support an important role for the connection of micropolitan areas with neighboring rural areas in the spread of syphilis. Public health interventions focusing on urban and micropolitan areas may effectively limit syphilis indirectly in nearby rural areas.

    Mapping Health Data: Improved Privacy Protection With Donut Method Geomasking

    A major challenge in mapping health data is protecting patient privacy while maintaining the spatial resolution necessary for spatial surveillance and outbreak identification. A new adaptive geomasking technique, referred to as the donut method, extends current methods of random displacement by ensuring a user-defined minimum level of geoprivacy. In donut method geomasking, each geocoded address is relocated in a random direction by at least a minimum distance, but less than a maximum distance. The authors compared the donut method with current methods of random perturbation and aggregation regarding measures of privacy protection and cluster detection performance by masking multiple disease field simulations under a range of parameters. Both the donut method and random perturbation performed better than aggregation in cluster detection measures. Relative to random perturbation, the donut method scored at least 42.7% higher on geoprivacy measures and less than 4.8% lower on cluster detection measures. Results show that the donut method provides a consistently higher level of privacy protection with a minimal decrease in cluster detection performance, especially in areas where the risk to individual geoprivacy is greatest.
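The displacement rule described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `donut_mask` is hypothetical, coordinates are assumed to be in a projected system with distances in the same units, and the choice to sample uniformly by area within the ring (rather than uniformly by distance) is an assumption the abstract does not confirm.

```python
import math
import random

def donut_mask(x, y, r_min, r_max, rng=random):
    """Move (x, y) in a random direction by a distance of at least r_min
    and less than r_max. ASSUMPTION: points are drawn uniformly by area
    within the ring; the source does not state the distance distribution."""
    if not 0 <= r_min < r_max:
        raise ValueError("require 0 <= r_min < r_max")
    theta = rng.uniform(0.0, 2.0 * math.pi)   # random direction
    u = rng.random()                          # u in [0, 1)
    # Inverse-CDF sampling of the radius for a uniform density on the annulus.
    r = math.sqrt(r_min ** 2 + u * (r_max ** 2 - r_min ** 2))
    return x + r * math.cos(theta), y + r * math.sin(theta)
```

Passing a seeded `random.Random` instance as `rng` makes the masking reproducible for testing; in an adaptive variant, `r_min` and `r_max` would be chosen per address from local population density to meet the user-defined privacy level.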