413 research outputs found

    A COMPREHENSIVE GEOSPATIAL KNOWLEDGE DISCOVERY FRAMEWORK FOR SPATIAL ASSOCIATION RULE MINING

    Continuous advances in modern data collection techniques help spatial scientists gain access to massive and high-resolution spatial and spatio-temporal data. Thus there is an urgent need to develop effective and efficient methods seeking to find unknown and useful information embedded in big-data datasets of unprecedentedly large size (e.g., millions of observations), high dimensionality (e.g., hundreds of variables), and complexity (e.g., heterogeneous data sources, space–time dynamics, multivariate connections, explicit and implicit spatial relations and interactions). Responding to this line of development, this research focuses on the utilization of the association rule (AR) mining technique for a geospatial knowledge discovery process. Prior attempts have sidestepped the complexity of the spatial dependence structure embedded in the studied phenomenon. Thus, adopting association rule mining in spatial analysis is rather problematic. Interestingly, a very similar predicament afflicts spatial regression analysis with a spatial weight matrix that would be assigned a priori, without validation on the specific domain of application. Besides, a dependable geospatial knowledge discovery process necessitates algorithms supporting automatic and robust but accurate procedures for the evaluation of mined results. Surprisingly, this has received little attention in the context of spatial association rule mining. To remedy the existing deficiencies mentioned above, the foremost goal for this research is to construct a comprehensive geospatial knowledge discovery framework using spatial association rule mining for the detection of spatial patterns embedded in geospatial databases and to demonstrate its application within the domain of crime analysis. It is the first attempt at delivering a complete geo-spatial knowledge discovery framework using spatial association rule mining

    Progression of a large syphilis outbreak in rural North Carolina through space and time: Application of a Bayesian Maximum Entropy graphical user interface

    In 2001, the primary and secondary syphilis incidence rate in rural Columbus County, North Carolina was the highest in the nation. To understand the development of syphilis outbreaks in rural areas, we developed and used the Bayesian Maximum Entropy Graphical User Interface (BMEGUI) to map syphilis incidence rates from 1999–2004 in seven adjacent counties in North Carolina. Using BMEGUI, incidence rate maps were constructed for two aggregation scales (ZIP code and census tract) with two approaches (Poisson and simple kriging). The BME maps revealed the outbreak was initially localized in Robeson County and possibly connected to more urban endemic cases in adjacent Cumberland County. The outbreak spread to rural Columbus County in a leapfrog pattern with the subsequent development of a visible low incidence spatial corridor linking Robeson County with the rural areas of Columbus County. Though the data are from the early 2000s, they remain pertinent, as the combination of spatial data with the extensive sexual network analyses, particularly in rural areas, gives thorough insights which have not been replicated in the past two decades. These observations support an important role for the connection of micropolitan areas with neighboring rural areas in the spread of syphilis. Public health interventions focusing on urban and micropolitan areas may effectively limit syphilis indirectly in nearby rural areas.

    A historical GIS for England and Wales: a framework for reconstructing past geographies and analysing long-term change

    PhD. This thesis describes the creation and possible uses of a Geographical Information System that contains the changing boundaries of the major administrative units of England and Wales from 1840 to 1974. For over 150 years the census, the General Register Office, and others have used these units to publish a wealth of data concerning the population of the country. The key issue addressed by the thesis is that changes in the administrative geography have hampered much research on long-term change in society that could have been done using these sources. The goal of the thesis is the creation of a framework to allow the analysis of long-term socio-economic change that makes maximum use of the available data. This involves not only making use of the data's attribute (statistical) component, but also their spatial and temporal components. In order to do this, the thesis provides solutions to two key problems. The first is how to build a GIS containing administrative units that incorporates an accurate record of their changing boundaries and can be linked to statistical data in a flexible manner. The second is how to remove the impact of boundary changes when comparing datasets published at different dates. This is done by devising a methodology for interpolating data from the administrative units in which they were published onto a single target geography. An evaluation of the accuracy of this interpolation is performed and examples are given of how this type of research could be conducted. Taken together, these will release information locked up within historical socio-economic statistics by allowing space to be explicitly incorporated into any explorations of the data. This, in turn, allows research to explore the past with increased levels of both spatial and attribute data for longer time periods.

    Processing aggregated data : the location of clusters in health data

    Spatially aggregated data is frequently used in geographical applications. Often spatial data analysis on aggregated data is performed in the same way as on exact data, which ignores the fact that we do not know the actual locations of the data. We here propose models and methods to take aggregation into account. For this we focus on the problem of locating clusters in aggregated data. More specifically, we study the problem of locating clusters in spatially aggregated health data. The data is given as a subdivision into regions with two values per region, the number of cases and the size of the population at risk. We formulate the problem as finding a placement of a cluster window of a given shape such that a cluster function depending on the population at risk and the cases is maximized. We propose area-based models to calculate the cases (and the population at risk) within a cluster window. These models are based on the areas of intersection of the cluster window with the regions of the subdivision. We show how to compute a subdivision such that within each cell of the subdivision the areas of intersection are simple functions. We evaluate experimentally how taking aggregation into account influences the location of the clusters found. Peer Reviewed. Postprint (published version).
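    The area-based idea in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes axis-aligned rectangular regions and a rectangular cluster window, and it assumes cases and population are spread uniformly within each region. The names `Rect`, `Region`, `intersection_area` and `window_totals` are hypothetical.

```python
from typing import Iterable, NamedTuple, Tuple


class Rect(NamedTuple):
    """Axis-aligned rectangle (an assumed, simplified region/window shape)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


class Region(NamedTuple):
    """One region of the subdivision with its two aggregated values."""
    shape: Rect
    cases: float
    population: float


def area(r: Rect) -> float:
    """Area of a rectangle (zero if it is degenerate)."""
    return max(0.0, r.xmax - r.xmin) * max(0.0, r.ymax - r.ymin)


def intersection_area(a: Rect, b: Rect) -> float:
    """Area of overlap between two axis-aligned rectangles."""
    width = min(a.xmax, b.xmax) - max(a.xmin, b.xmin)
    height = min(a.ymax, b.ymax) - max(a.ymin, b.ymin)
    return max(0.0, width) * max(0.0, height)


def window_totals(window: Rect, regions: Iterable[Region]) -> Tuple[float, float]:
    """Estimate cases and population inside the window.

    Each region contributes in proportion to the share of its area that
    the window covers (uniform-density assumption).
    """
    cases = 0.0
    population = 0.0
    for region in regions:
        region_area = area(region.shape)
        if region_area == 0.0:
            continue  # skip degenerate regions
        share = intersection_area(window, region.shape) / region_area
        cases += share * region.cases
        population += share * region.population
    return cases, population


# Two unit-square regions side by side, window covering half of each.
regions = [
    Region(Rect(0.0, 0.0, 1.0, 1.0), cases=4.0, population=100.0),
    Region(Rect(1.0, 0.0, 2.0, 1.0), cases=2.0, population=50.0),
]
est_cases, est_population = window_totals(Rect(0.5, 0.0, 1.5, 1.0), regions)
```

    A cluster function (for example the ratio of estimated cases to estimated population, which is an assumption here) could then be evaluated for each candidate window placement and the maximizing placement kept.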

    Ecological studies of Clostridioides difficile and COVID-19 infection with the application of space-time risk models

    Date on title page is 2021. Degree awarded in 2022. Infectious diseases continue to pose major global health threats. With the recent devastation from the COVID-19 pandemic and growing concerns of healthcare-associated infections (HAIs), there is a worldwide requirement for stringent techniques to monitor and understand the key drivers for infections. Infectious diseases have an inherent spatial dimension due to the contagious nature of viruses and bacteria. This thesis aims to explore the use of spatial and spatio-temporal techniques applied to infections, specifically Clostridioides difficile infection (CDI) and COVID-19, to identify risk factors at an ecological population-based level. A mixture of open-source and routinely collected data, at different spatial scales, was used to understand the surveillance capacities of observational public health data. Antimicrobial prescribing and stewardship have been a global focus in the last decade as concerns have grown with emergent novel antibiotic-resistant infections. CDI has been shown to have a well-defined association with certain broad-spectrum antibiotic classes and other environmental factors; however, there is a gap in the literature aiming to understand these relationships ecologically and spatially. The main focus of this thesis was to use spatio-temporal models to investigate spatial risk factors of CDI incidence, such as GP antimicrobial prescribing, in Scotland and Wales. Similar spatial techniques were then applied to investigate the spatial distribution of COVID-19 testing during the first wave of the 2020 epidemic in Scotland. The relevant spatial and spatio-temporal models applied throughout this thesis were initially discussed in Chapter 2. The spatial distribution of Scottish GP antibiotic prescribing rates, from 2016 to 2018, was investigated in Chapter 3 using spatial point-location correlation methods. 
Risk factors of increased GP antibiotic prescribing were explored, showing GP practice demographic information as key drivers of increased antibiotic prescribing. These analyses were followed by an exploration of Scottish CDI incidence data, from 2014 to 2018, at a small areal level (intermediate zones (IZ)), to understand spatial auto-correlation and temporal trends of CDI incidence in Chapter 4. Population demographic risk factors, as highlighted in the literature, were obtained at the same spatial scale and assessed as ecological risk factors of CDI incidence using conditional autoregressive (CAR) models. The next phase of this thesis then combined the previous two analyses, introducing a multi-level spatial problem, which aimed to explore central risk factors of CDI that were not available at the same spatial scale in Chapter 5. Spatial interpolation methods were applied to manipulate GP antibiotic prescribing point-location data and areal-unit cattle density data to match the CDI incidence at an IZ spatial scale. These data could then be explored as ecological risk factors of CDI incidence, carrying forward the previously defined CAR model from Chapter 4 and adjusting for demographic confounders. Welsh CDI incidence and primary care antibiotic prescribing data offered the opportunity to compare between two countries in the UK. The retrospective ecological study in Chapter 6 used aggregated disease surveillance data to understand the impact of total and high-risk Welsh GP antibiotic prescribing on total and stratified inpatient/noninpatient CDI incidence. Location and health board information were anonymised preventing a formal spatial analysis, however, the results were comparable to previous chapter findings and supported the hypothesis of an increased risk of CDI incidence reflected in GP antibiotic prescribing rates, particularly high-risk antibiotics, and population demographics. 
Finally, at the beginning of the COVID-19 pandemic, it became evident that the methodologies applied in this thesis could support the investigation of the spread of COVID-19 infections. The work presented in Chapter 7 aimed to explore how best to capture spatial patterns of community COVID-19 infection by conducting a spatiotemporal analysis on three data streams: positive test rates, relevant NHS24 calls and COVID Symptom Study (CSS) predicted cases, to assess which was best for early disease surveillance. Results showed both sources to identify similar trends of COVID-19 and gold-standard testing data, particularly when used in parallel. This thesis has provided new insights into the associated risks between CDI incidence and GP antibiotic prescribing in Scotland and Wales, demonstrating the capabilities of open-source and routinely collected public health data when applied in a spatial framework. These results support the requirement of stringent measures to reduce antibiotic prescribing in the community. It also highlights the beneficial use and suitability of analysing infectious disease data with spatial techniques to address gaps in the literature to understand population-based risk factors of disease. There is a strong argument for future research into methods of analysing multi-level spatial data, particularly in the application of observational public health data.

    Urban design and drug crime: uncovering the spatial logic of drug crime in relation to the urban street network and land use mosaic in London

    This multidisciplinary research is concerned with the ways in which the morphology of the urban landscape may affect the spatial distribution of drug crime incidents. Following from this rationale, the research pursued the following three objectives. First, the research explored where drug dealers are known to sell drugs, and the extent to which and in what ways these places differ from places where they do not. In particular, the research focused on examining whether the types of places at which drugs are sold have the street network characteristics of places that offer good retail potential. Employing space syntax technique and event count regression models, the analysis showed that street permeability and proximity to a high street significantly increase the likelihood of drug crime. Second, the research examined drug crime in relation to legal facilities, which inherently and routinely generate large flows of people. Using network distance buffers, the criminogenic fields of the facilities were identified. The regression results showed that not only does the facility itself attract crime, but the facility’s specific configurational positioning on the street network also influences the likelihood of crime. The last part of the research examined the relative positioning of drug dealing locations in the city with reference to the level of permeability, the drug types and the quantities being sold per street segment. The results showed a spatial differentiation amongst varying drug types according to their drug classes. The overall picture suggested that the urban fabric, particularly the characteristics of the street network configuration and the way land uses are distributed across the street network, has a great effect on drug occurrences.

    Offender Residential Concentrations: A Longitudinal Study in Birmingham, England

    The overarching aim of this thesis is to advance understanding into the geographic distribution of offender residences, that is, where known offenders live. Although this strand of research emerged amidst the earliest studies in spatial criminology, contemporary research has since favoured the examination of offences, much at the expense of offender residences. This shift has occurred despite there being strong theoretical and empirical reasons for studying both. To revive interest into offender residences, and achieve the aim of this thesis, three key themes are identified through a comprehensive review of existing literature, relating to spatial scale, longitudinal stability and explanation. From these, three research questions are posed, the answers to which constitute the original contribution of this thesis. Firstly, what is the most appropriate spatial scale to study offender residential concentrations? Secondly, to what extent do offender residential concentrations demonstrate stability over time? Thirdly, how can we explain the longitudinal (in)stability of offender residential concentrations? To answer these research questions, analysis is conducted on longitudinal police recorded data of known offender residences in Birmingham between 2007 and 2016, supplied by West Midlands Police Force, and census data under Open Government Licence. The methods deployed are largely inspired by the (considerably more advanced) offence strand of research, and include descriptive statistics, extensive (spatial) visualisations, multilevel variance partitions, novel longitudinal clustering techniques and spatially lagged multivariable regression models. Findings suggest that small (‘micro’) spatial scales are most suitable for studying the geography of offender residences. 
The degree to which concentrations demonstrate longitudinal (in)stability varies by the methods deployed, but findings suggest a reasonable degree of volatility over time, some of which is due to the individual-level residential mobility of offenders. Longitudinal trends can be explained by a number of demographic characteristics, including deprivation, ethnic diversity and housing tenure. Discussions emerge from these findings which have implications for methodology, theory and policy, opening avenues for future research.

    An information statistics approach to zone design in the geography of health outcomes and provision

    Social scientists and policy makers are usually faced with boundary changes in administrative areas over time. The control of spatial issues deriving from boundary changes is even more important when they affect the organisation and allocation of resources in a national health system. In addition, the problem becomes more acute when health organisations analyse sensitive data using geographies constructed to serve other administrative purposes. In recent literature, the modifiable nature of areas is reflected in the modifiable areal unit problem (MAUP), and widely acknowledged frameworks for geographical analysis have been developed with the aim of overcoming this problem. The aim of this research is to suggest and develop methodologies that support health-related studies in informing decisions. In order to achieve this aim the following research objectives have been developed. In this thesis, the crucial objective is to identify how geographical problems are related to health policies, exploring available methodologies and suggesting solutions derived from informative statistic measures to unresolved practical issues. Consequently, an automated computer system was developed, formulating these problems in a graph theory context and utilising their components through object-oriented algorithms. The system is tested and evaluated in a series of case studies investigating the effects of MAUP in various geographies and aggregation levels. The overall objective provides strategies and valuable practice for using the system as well as suggesting areas of health research that may benefit from the methodology. In the final chapter, the thesis concludes with a summary of findings and limitations for the suggested methodology, providing an outline of the research directions for further work into the spatial issues in relation to health research. EThOS - Electronic Theses Online Service. Hellenic State Scholarship Foundation (HSSF). GB. United Kingdom.