12,312 research outputs found

    Geocoding health data with Geographic Information Systems: a pilot study in northeast Italy for developing a standardized data-acquiring format

    Introduction. Geographic Information Systems (GIS) have become an innovative and increasingly important tool for analyzing relationships between public health data and the environment. This study, though focused on a Local Health Unit in northeastern Italy, could serve as a benchmark for developing a standardized national data-acquisition format, providing step-by-step instructions on handling address elements specific to Italian language and conventions. Methods. Geocoding was carried out on a health database comprising 268,517 records from the Local Health Unit of Rovigo in the Veneto region, covering the 10 years from 2001 to 2010. The Map Service provided by the Environmental Systems Research Institute (ESRI, Redlands, CA) served as the reference data, and ESRI® ArcMap 10.0 as the GIS software, for the geocoding process. Results. The first geocoding attempt produced a poor-quality result, with about 40% of the addresses matched. Manual standardization was then performed to improve the quality of the results, and from it a set of guiding principles for geocoding health data was derived. High-level geocoding detail provides a more precise geographic representation of health-related events. Conclusions. The main achievement of this study was to outline some of the difficulties encountered when geocoding health data and to put forward a set of guidelines that could facilitate the process and improve the quality of the results. Public health informatics is an emerging specialty that focuses on applying information science and technology to public health practice and research. This study could therefore draw the attention of the National Health Service to the underestimated problem of geocoding accuracy in health-related data for environmental risk assessment.
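
The abstract does not list the standardization rules themselves, so the following is only a minimal sketch of what manual standardization of Italian address elements might look like before geocoding. The abbreviation table and the function name `standardize_address` are illustrative assumptions, not the study's published guidelines.

```python
import re

# Hypothetical abbreviation table: these expansions are illustrative
# assumptions, not the rules derived in the study.
STREET_TYPES = {
    "V.": "VIA",
    "V.LE": "VIALE",
    "P.ZZA": "PIAZZA",
    "P.ZA": "PIAZZA",
    "C.SO": "CORSO",
    "VIC.": "VICOLO",
}

# Street part, separator, optional "N." marker, house number such as 12 or 3/A.
ADDRESS_RE = re.compile(r"^(.*?)[,\s]+(?:N\.?\s*)?(\d+(?:/?[A-Z])?)$")


def standardize_address(raw: str) -> str:
    """Return an upper-cased 'STREET-TYPE NAME, NUMBER' string."""
    # Collapse repeated whitespace and normalize case.
    text = re.sub(r"\s+", " ", raw.strip().upper())
    match = ADDRESS_RE.match(text)
    if match:
        street, number = match.group(1), match.group(2)
    else:
        street, number = text, ""
    street = street.rstrip(", ")
    # Expand an abbreviated street type if it is the first token.
    first, _, rest = street.partition(" ")
    first = STREET_TYPES.get(first, first)
    street = f"{first} {rest}".strip()
    return f"{street}, {number}" if number else street
```

Normalizing records this way before submitting them to a geocoder is one plausible route to raising a low initial match rate; which rules actually matter depends on the reference data in use.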

    A method and a tool for geocoding and record linkage

    For many years, researchers have presented the geocoding of postal addresses as a challenge, and several research works have been devoted to the geocoding process. This paper presents theoretical and technical aspects of geolocalization, geocoding, and record linkage. It shows the possibilities and limitations of existing methods and commercial software, identifying areas for further research. In particular, we present a methodology and a computing tool for correcting and geocoding mailing addresses. The methodology has two main steps: a preliminary address-correction (address-matching) step, followed by the geocoding of the identified addresses. Additionally, we present some results from the processing of real data sets. Finally, in the discussion, areas for further research are identified. Keywords: address correction; geocoding; matching; data management; record linkage
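
The two-step idea (correct the address first, then geocode the corrected form) can be sketched as follows. This is not the paper's tool: the reference table, its coordinates, and the use of Python's `difflib` for approximate matching are all assumptions made for illustration.

```python
from difflib import get_close_matches

# Hypothetical reference table: street names and coordinates are invented.
REFERENCE = {
    "MAIN STREET": (10.0, 20.0),
    "OAK AVENUE": (10.5, 20.5),
    "CHURCH ROAD": (11.0, 21.0),
}


def correct_address(raw, cutoff=0.8):
    """Step 1: map a possibly misspelled street name to a reference entry."""
    candidates = get_close_matches(
        raw.strip().upper(), list(REFERENCE), n=1, cutoff=cutoff
    )
    return candidates[0] if candidates else None


def geocode(raw, cutoff=0.8):
    """Step 2: return (matched_name, (lat, lon)), or None when nothing matches."""
    name = correct_address(raw, cutoff)
    return (name, REFERENCE[name]) if name else None
```

Keeping correction separate from coordinate lookup means unmatched records can be routed to manual review without touching the geocoding step.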

    Historical collaborative geocoding

    The latest digital developments have provided large data sets that are increasingly easy to access and use. These data sets often contain indirect localisation information, such as historical addresses. Historical geocoding is the process of transforming the indirect localisation information into direct localisation that can be placed on a map, which enables spatial analysis and cross-referencing. Many efficient geocoders exist for current addresses, but they do not deal with the temporal aspect and are based on a strict hierarchy (..., city, street, house number) that is hard or impossible to use with historical data. Indeed, historical data are full of uncertainties (temporal aspect, semantic aspect, spatial precision, confidence in the historical source, ...) that cannot be resolved, as there is no way to go back in time to check. We propose an open source, open data, extensible solution for geocoding that is based on building gazetteers composed of geohistorical objects extracted from historical topographical maps. Once the gazetteers are available, geocoding a historical address is a matter of finding the geohistorical object in the gazetteers that best matches the historical address. The matching criteria are customisable and include several dimensions (fuzzy semantic, fuzzy temporal, scale, spatial precision, ...). As the goal is to facilitate historical work, we also propose web-based user interfaces that help geocode addresses (one address or batch mode) and display the results over current or historical topographical maps, so that they can be checked and collaboratively edited. The system is tested on the city of Paris for the 19th and 20th centuries, shows a high return rate, and is fast enough to be used interactively. Comment: WORKING PAPE
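
To illustrate matching on more than one dimension, here is a minimal sketch that combines a fuzzy name score with a fuzzy temporal score. The class `GeoHistoricalObject`, the weights, the 30-year tolerance, and the gazetteer entries used in the usage note are assumptions for illustration; the abstract does not specify the system's actual scoring.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class GeoHistoricalObject:
    name: str
    valid_from: int  # first year the object is attested (assumed field)
    valid_to: int    # last year the object is attested (assumed field)
    x: float
    y: float


def temporal_score(obj, year, tolerance=30):
    """1.0 inside the validity interval, decaying linearly to 0 over `tolerance` years."""
    if obj.valid_from <= year <= obj.valid_to:
        return 1.0
    gap = min(abs(year - obj.valid_from), abs(year - obj.valid_to))
    return max(0.0, 1.0 - gap / tolerance)


def best_match(query, year, gazetteer, w_name=0.7, w_time=0.3):
    """Return the gazetteer object with the highest weighted name + time score."""
    def score(obj):
        name_sim = SequenceMatcher(None, query.lower(), obj.name.lower()).ratio()
        return w_name * name_sim + w_time * temporal_score(obj, year)

    return max(gazetteer, key=score)
```

With two invented entries named "rue du Moulin", one valid 1800 to 1850 and one valid 1880 to 1920, a query dated 1890 selects the later entry because only the temporal score separates them.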

    An evaluation framework for comparing geocoding systems

    BACKGROUND: Geocoding, the process of converting textual information describing a location into one or more digital geographic representations, is a routine task performed at large organizations and government agencies across the globe. In a health context, this task is often a fundamental first step performed prior to all other operations in a spatially based health study. As such, the quality of the geocoding system used within these agencies is of paramount concern to the agency (the producer) and to the researchers or policy-makers who wish to use these data (the consumers). However, geocoding systems are continually evolving, with new products regularly coming on the market. Agencies must develop and use criteria across a number of axes when faced with decisions about building, buying, or maintaining any particular geocoding system. To date, published criteria have focused on one or more aspects of geocode quality without taking a holistic view of a geocoding system’s role within a large organization. The primary purpose of this study is to develop and test an evaluation framework to assist a large organization in determining which geocoding systems will meet its operational needs. METHODS: A geocoding platform evaluation framework is derived through an examination of prior literature on geocoding accuracy. The framework extends commonly used geocoding metrics to take into account the specific concerns of large organizations for which geocoding is a fundamental operational capability tightly knit into their core mission of processing health data records. A case study is performed to evaluate the strengths and weaknesses of five geocoding platforms currently available in the Australian geospatial marketplace. RESULTS: The evaluation framework developed in this research proved successful in differentiating between key capabilities of geocoding systems that are important in the context of a large organization with significant investments in geocoding resources. Results from the proposed methodology highlight important differences across all axes of geocoding system comparison, including spatial data output accuracy, reference data coverage, system flexibility, the potential for tight integration, and the need for specialized staff and/or development time and funding. Such results can empower decision-makers within large organizations as they make decisions about and investments in geocoding systems.

    ELF GeoLocator Service

    Papers, communications and posters presented at the 17th AGILE Conference on Geographic Information Science "Connecting a Digital Europe through Location and Place", held at the Universitat Jaume I from 3 to 6 June 2014. This paper describes the implementation of a gazetteer service, GeoLocator, developed in the project ‘European Location Framework’ (ELF). The GeoLocator service contains data from the INSPIRE/ELF themes Geographical Names, Administrative Units and Addresses. The functionalities of the service include geocoding, administrative unit-limited geocoding, fuzzy geocoding, reverse geocoding and administrative unit-limited reverse geocoding.
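
As a rough illustration of reverse geocoding with an optional administrative-unit restriction, the sketch below returns the nearest gazetteer entry to a coordinate. It does not reproduce the GeoLocator service interface: the in-memory gazetteer, its entries, and the function names are hypothetical.

```python
import math

# Hypothetical gazetteer rows: (name, administrative_unit, lat, lon).
GAZETTEER = [
    ("Alpha", "Unit-1", 60.00, 25.00),
    ("Beta", "Unit-1", 60.10, 25.10),
    ("Gamma", "Unit-2", 60.01, 25.01),
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def reverse_geocode(lat, lon, admin_unit=None):
    """Return the nearest place name, optionally restricted to one administrative unit."""
    rows = [g for g in GAZETTEER if admin_unit is None or g[1] == admin_unit]
    if not rows:
        return None
    return min(rows, key=lambda g: haversine_km(lat, lon, g[2], g[3]))[0]
```

Restricting the candidate set before computing distances is what makes the administrative unit-limited variant differ from plain reverse geocoding: the globally nearest name may be excluded.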

    Modeling the probability distribution of positional errors incurred by residential address geocoding

    BACKGROUND: The assignment of a point-level geocode to subjects' residences is an important data assimilation component of many geographic public health studies. Often, these assignments are made by a method known as automated geocoding, which attempts to match each subject's address to an address-ranged street segment georeferenced within a streetline database and then interpolate the position of the address along that segment. Unfortunately, this process results in positional errors. Our study sought to model the probability distribution of positional errors associated with automated geocoding and E911 geocoding. RESULTS: Positional errors were determined for 1423 rural addresses in Carroll County, Iowa as the vector difference between each 100%-matched automated geocode and its true location as determined by orthophoto and parcel information. Errors were also determined for 1449 60%-matched geocodes and 2354 E911 geocodes. Huge (> 15 km) outliers occurred among the 60%-matched geocoding errors; outliers also occurred for the other two types of geocoding errors but were much smaller. E911 geocoding was more accurate (median error length = 44 m) than 100%-matched automated geocoding (median error length = 168 m). The empirical distributions of positional errors associated with 100%-matched automated geocoding and E911 geocoding exhibited a distinctive Greek-cross shape and had many other interesting features that could not be fitted adequately by a single bivariate normal or t distribution. However, mixtures of t distributions with two or three components fit the errors very well. CONCLUSION: Mixtures of bivariate t distributions with few components appear to be flexible enough to fit many positional error datasets associated with geocoding, yet parsimonious enough to be feasible for nascent applications of measurement-error methodology to spatial epidemiology.
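
The address-range interpolation step and the error vector described above can be sketched as follows. The sketch assumes projected coordinates in metres and a straight segment, and it ignores street-side offsets and odd/even parity; the function names and sample values are illustrative, not taken from the study.

```python
import math


def interpolate_address(house_number, range_from, range_to, start, end):
    """Place a house number along a straight segment in proportion to its address range."""
    if range_to == range_from:
        t = 0.5  # degenerate range: fall back to the segment midpoint
    else:
        t = (house_number - range_from) / (range_to - range_from)
    t = min(1.0, max(0.0, t))  # clamp numbers that fall outside the range
    return (
        start[0] + t * (end[0] - start[0]),
        start[1] + t * (end[1] - start[1]),
    )


def error_vector(geocode, truth):
    """Return (dx, dy, length) of the difference between a geocode and the true location."""
    dx = geocode[0] - truth[0]
    dy = geocode[1] - truth[1]
    return dx, dy, math.hypot(dx, dy)
```

For a segment running from (0, 0) to (1000, 0) with address range 100 to 200, house number 150 lands at (500, 0); if the true location were (530, 40), the error length would be 50 m. In rural areas, where segments are long and houses unevenly spaced, this proportional assumption is a natural source of the positional errors the study models.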

    Technical aspects of Envisat-ASAR geocoding capability at DLR

    Based on experience with the geocoding systems for ERS-D-PAF (GEOS), the SIR-C/X-SAR (GEOS) and SRTM missions (GeMoS), geocoding functionality has been extended for Envisat ASAR data. The existing Envisat ASAR Geocoding System (EGEO) can handle all Level 1-b image products (IMS, APS, IMP, APP, IMM, APM, WSM and GM1). Complementary to geocoded products provided by ESA (IMG, APG), the geocoding procedure applied at the German Aerospace Center (DLR) makes use of a DEM to achieve higher geolocation accuracy. The resulting geocoded image is defined either as EEC (Enhanced Ellipsoid Corrected) or as ETC (Enhanced Terrain Corrected). These products mainly differ in the underlying DEM used for geocoding. The EEC utilizes GLOBE, while the ETC utilizes the “best” DEM available in the database. This “best” DEM can be assembled from different DEM data sets (e.g. derived from SRTM, ERS, …). Further differences, such as the interpolative (EEC) and rigorous (ETC) geocoding approaches, will also be outlined. Furthermore, an incidence angle mask can be generated. The necessary upgrades for geocoding ASAR stripline products (e.g. IMM, WSM) will be presented. Stripline products cover a large area along track, as they consist of concatenated stand-alone products (“slices”). Thus the updates of relevant parameters have to be taken into account.

    Geocoding Large Population‐level Administrative Datasets at Highly Resolved Spatial Scales

    Using geographic information systems to link administrative databases with demographic, social, and environmental data allows researchers to use spatial approaches to explore relationships between exposures and health. Traditionally, spatial analysis in public health has focused on the county, ZIP code, or tract level because of limitations to geocoding at highly resolved scales. Using 2005 birth and death data from North Carolina, we examine our ability to geocode population‐level datasets at three spatial resolutions: ZIP code, street, and parcel. We achieve high geocoding rates at all three resolutions, with statewide street geocoding rates of 88.0% for births and 93.2% for deaths. We observe differences in geocoding rates across demographics and health outcomes, with lower geocoding rates in disadvantaged populations and the most dramatic differences occurring across the urban‐rural spectrum. Our results suggest that highly resolved spatial data architectures for population‐level datasets are viable through geocoding individual street addresses. We recommend routinely geocoding administrative datasets to the highest spatial resolution feasible, allowing public health researchers to choose the spatial resolution used in analysis based on an understanding of the spatial dimensions of the health outcomes and exposures being investigated. Such research, however, must acknowledge how disparate geocoding success across subpopulations may affect findings. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/108258/1/tgis12052.pd
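
Comparing geocoding success across subpopulations comes down to a per-group match rate. The sketch below assumes each record has already been reduced to a (group, matched) pair; the function name and the sample records in the usage note are invented for illustration and are not the study's data.

```python
from collections import defaultdict


def geocoding_rates(records):
    """Return {group: percentage of records geocoded} from (group, matched) pairs."""
    totals = defaultdict(int)
    matched = defaultdict(int)
    for group, ok in records:
        totals[group] += 1
        matched[group] += bool(ok)
    return {group: 100.0 * matched[group] / totals[group] for group in totals}
```

With an invented sample of 10 urban records (9 matched) and 4 rural records (3 matched), the function reports 90.0 for urban and 75.0 for rural, the kind of gap that an analysis would need to acknowledge.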

    Identifying Geographic Clusters: A Network Analytic Approach

    In recent years there has been a growing interest in the role of networks and clusters in the global economy. Despite being a popular research topic in economics, sociology and urban studies, geographical clustering of human activity has often been studied by means of predetermined geographical units such as administrative divisions and metropolitan areas. This approach is intrinsically time invariant and does not allow one to differentiate between different activities. Our goal in this paper is to present a new methodology for identifying clusters that can be applied to different empirical settings. We use a graph approach based on k-shell decomposition to analyze world biomedical research clusters based on PubMed scientific publications. We identify research institutions and locate their activities in geographical clusters. Leading areas of scientific production and their top-performing research institutions are consistently identified at different geographic scales.
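
For readers unfamiliar with k-shell decomposition, here is a minimal sketch of the standard peeling procedure on an undirected graph. How the paper builds its graph from PubMed records is not stated in the abstract, so the adjacency format and the node labels in the usage note are assumptions.

```python
def k_shell_decomposition(adjacency):
    """Return {node: shell index} for an undirected graph given as {node: set(neighbours)}."""
    degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
    remaining = set(adjacency)
    shell = {}
    k = 0
    while remaining:
        # The next shell index is the smallest degree still present.
        k = max(k, min(degree[n] for n in remaining))
        peel = [n for n in remaining if degree[n] <= k]
        while peel:
            node = peel.pop()
            if node not in remaining:
                continue  # already removed via a duplicate queue entry
            remaining.discard(node)
            shell[node] = k
            # Removing a node lowers its neighbours' degrees, which may
            # pull them into the current shell as well.
            for other in adjacency[node]:
                if other in remaining:
                    degree[other] -= 1
                    if degree[other] <= k:
                        peel.append(other)
    return shell
```

On a hypothetical graph with a triangle A-B-C, a pendant node D attached to A, and an isolated node E, the triangle forms the 2-shell, D falls in the 1-shell and E in the 0-shell. Nodes in the highest shells are the densely interconnected core, which is what makes the method suitable for identifying clusters without predefined geographic units.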