
    Developing tools and models for evaluating geospatial data integration of official and VGI data sources

    PhD Thesis. In recent years, systems have been developed which enable users to produce, share and update information on the web effectively and freely as User Generated Content (UGC), including Volunteered Geographic Information (VGI). Data quality assessment is a major concern for supporting the accurate and efficient spatial data integration required if VGI is to be used alongside official, formal, usually governmental datasets. This thesis aims to develop tools and models for assessing such integration possibilities. Initially, the geometrical similarity of formal and informal data was examined. Geometrical analyses were performed by developing specific programme interfaces to assess the positional, linear and polygon shape similarity among reference field survey (FS) data; official datasets, such as data from the Ordnance Survey (OS), UK, and the General Directorate for Survey (GDS), Iraq; and VGI, such as OpenStreetMap (OSM) datasets. A discussion of the design and implementation of these tools and interfaces is presented. A methodology was developed to assess positional and shape similarity by applying different metrics and standard indices: the National Standard for Spatial Data Accuracy (NSSDA) for positional quality, buffering overlays for linear similarity, and moment invariants for polygon shape similarity evaluations. The results suggested that difficulties exist for any geometrical integration of OSM data with both benchmark FS and formal datasets, but that formal data is very close to the reference datasets. An investigation was carried out into contributing factors, such as data sources, feature types and number of data collectors, that may affect the geometrical quality of OSM data and consequently affect the integration of OSM datasets with FS, OS and GDS data.
Factorial designs were undertaken in this study in order to develop and implement an experiment to discover the effect of these factors individually and the interactions between them. The analysis found that data source is the most significant factor affecting the geometrical quality of OSM datasets, and that there are interactions among all these factors at different levels. This work also investigated the possibility of integrating the feature classifications of official datasets, such as data from the OS and GDS geospatial data agencies, and informal datasets, such as OSM. In this context, two different models were developed. The first analysis evaluated the semantic integration of corresponding feature classifications of the compared datasets. The second model assessed XML schema matching of the feature classifications of the tested datasets. This initially involved a tokenization process to split classifications composed of multiple words into single words. Subsequently, the feature classifications were encoded as XML schema trees. The semantic similarity, data type similarity and structural similarity were measured between the nodes of compared schema trees. Once these three similarities had been computed, a weighted combination technique was adopted to obtain the overall similarity. The findings of both sets of analysis were not encouraging as far as the possibility of effectively integrating the feature classifications of VGI datasets, such as OSM, and formal datasets, such as OS and GDS data, is concerned. Funded by the Ministry of Higher Education and Scientific Research, Republic of Iraq.
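The weighted combination step described above can be sketched as follows. The weights and the component scores here are illustrative assumptions, not values from the thesis:

```python
# Sketch of combining node-level similarities into one overall score for
# XML schema matching. Weights are hypothetical and must sum to 1.

def overall_similarity(semantic, datatype, structural,
                       weights=(0.5, 0.2, 0.3)):
    """Weighted combination of semantic, data type and structural similarity."""
    w_sem, w_dt, w_str = weights
    assert abs(w_sem + w_dt + w_str - 1.0) < 1e-9, "weights must sum to 1"
    return w_sem * semantic + w_dt * datatype + w_str * structural

# Example: a pair of schema-tree nodes with strong semantic agreement but
# weak structural agreement.
score = overall_similarity(semantic=0.9, datatype=1.0, structural=0.4)
print(round(score, 2))  # 0.77
```

Tuning the weights lets a matcher favour, say, label semantics over tree position, which is the usual design choice in schema-matching systems.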

    Map Conflation using Piecewise Linear Rubber-Sheeting Transformation between Layout and As-Built Plans in Kumasi Metropolis.

    Context and background: Accurately integrating different geospatial data sets remains a challenging task because diverse geospatial data may have different accuracy levels and formats. Surveyors may typically create several arbitrary coordinate systems at local scales, leading to a variety of coordinate datasets that remain unconsolidated and inhomogeneous.
    Methodology: In this study, a piecewise rubber-sheeting conflation, or geometric correction, approach is used to accomplish transformations between such a pair of data sets for accurate data integration. Rubber-sheeting, or piecewise linear homeomorphism, is necessary because the different plans' data would rarely match up correctly, for reasons such as the method of setting out from the design to the ground situation and/or the non-accommodation of existing developments in the design.
    Results: The conflation in ArcGIS using rubber-sheet transformation achieved integration to a mean displacement error of 1.58 feet (0.48 meters) from an initial mean displacement error of 71.46 feet (21.78 meters), an improvement of almost 98%. The rubber-sheet technique gave a near-exact point-matching transformation and could be used to integrate zone plans with as-built surveys to address the challenges in correcting zonal plans in land records. It is further recommended to investigate the incorporation of textual information recognition and address geocoding, enabling the use of on-site road names and plot numbers to detect points for matching.
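The reported improvement figure follows directly from the two mean displacement errors given in the abstract; the feet-to-metre conversion uses the standard factor:

```python
# Reproducing the accuracy-improvement figure from the abstract's numbers:
# mean displacement error of 71.46 ft before and 1.58 ft after rubber-sheeting.

FT_TO_M = 0.3048  # international foot to metres
before_ft, after_ft = 71.46, 1.58

improvement = (before_ft - after_ft) / before_ft * 100
print(f"{before_ft * FT_TO_M:.2f} m -> {after_ft * FT_TO_M:.2f} m "
      f"({improvement:.1f}% improvement)")
# 21.78 m -> 0.48 m (97.8% improvement)
```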

    Geospatial crowdsourced data fitness analysis for spatial data infrastructure based disaster management actions

    The reporting of disasters has changed from official media reports to citizen reporters who are at the disaster scene. This kind of crowd-based reporting, related to disasters or any other events, is often identified as 'Crowdsourced Data' (CSD). CSD are freely and widely available thanks to current technological advancements. The quality of CSD is often problematic, as it is created by citizens of varying skills and backgrounds. CSD is generally unstructured, and its quality remains poorly defined. Moreover, the location availability of CSD and the quality of any available locations may be incomplete. Traditional data quality assessment methods and parameters are also often incompatible with the unstructured nature of CSD due to its undocumented nature and missing metadata. Although other research has identified credibility and relevance as possible CSD quality assessment indicators, the available assessment methods for these indicators are still immature. In the 2011 Australian floods, citizens and disaster management administrators used the Ushahidi Crowdmap platform and the Twitter social media platform to extensively communicate flood-related information, including hazards, evacuations, help services, road closures and property damage. This research designed a CSD quality assessment framework and tested the quality of the 2011 Australian floods' Ushahidi Crowdmap and Twitter data. In particular, it explored location availability and location quality assessment, semantic extraction of hidden location toponyms, and the analysis of the credibility and relevance of reports. The research was conducted using the Design Science (DS) research method often utilised in Information Science (IS) research. The location availability assessment of the Ushahidi Crowdmap and Twitter data compared the quality of available locations against three datasets:
Google Maps, OpenStreetMap (OSM) and the Queensland Department of Natural Resources and Mines' (QDNRM) road data. Missing locations were semantically extracted using Natural Language Processing (NLP) and gazetteer lookup techniques. The credibility of the Ushahidi Crowdmap dataset was assessed using a naive Bayesian Network (BN) model commonly utilised in spam email detection. CSD relevance was assessed by adapting Geographic Information Retrieval (GIR) relevance assessment techniques also utilised in the IT sector. Thematic and geographic relevance were assessed using the Term Frequency–Inverse Document Frequency Vector Space Model (TF-IDF VSM) and NLP based on semantic gazetteers. Results of the CSD location comparison showed that the combined use of non-authoritative and authoritative data improved location determination. The semantic location analysis indicated some improvement in the location availability of the tweets and Crowdmap data; however, the quality of the new locations was still uncertain. The credibility analysis revealed that spam email detection approaches are feasible for CSD credibility detection; however, it was critical to train the model in a controlled environment using structured training, including modified training samples. The use of GIR techniques for CSD relevance analysis provided promising results. A separate relevance-ranked list of the same CSD data was prepared through manual analysis; the two lists generally agreed, which indicated the system's potential to analyse relevance in a similar way to humans. This research showed that CSD fitness analysis can potentially improve the accuracy, reliability and currency of CSD and may be utilised to fill information gaps in authoritative sources. The integrated and autonomous CSD qualification framework presented provides a guide for flood disaster first responders and could be adapted to support other forms of emergencies.
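The TF-IDF vector-space relevance scoring mentioned above can be sketched as follows: each report is a bag of words weighted by TF-IDF, and relevance to a query is cosine similarity. The corpus and query here are invented toy examples, not the study's data:

```python
# Minimal TF-IDF vector space model for ranking crowdsourced reports
# against a query. Whitespace tokenisation only; no stemming or stop words.
import math
from collections import Counter

docs = [
    "road closed flood water rising evacuate",
    "help needed evacuation centre open",
    "concert tickets on sale tonight",
]
query = "flood evacuation road closure"

def tfidf_vectors(texts):
    """TF-IDF weight each text against the whole collection."""
    tokenised = [t.split() for t in texts]
    n = len(tokenised)
    df = Counter(w for toks in tokenised for w in set(toks))
    idf = {w: math.log(n / df[w]) + 1 for w in df}  # smoothed IDF
    return [{w: c * idf[w] for w, c in Counter(toks).items()}
            for toks in tokenised]

def cosine(a, b):
    dot = sum(w_a * b.get(term, 0.0) for term, w_a in a.items())
    norm = lambda v: math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm(a) * norm(b)) if a and b else 0.0

vecs = tfidf_vectors(docs + [query])            # score the query in the same space
scores = [cosine(v, vecs[-1]) for v in vecs[:-1]]
print(max(range(len(docs)), key=scores.__getitem__))  # index of most relevant report
```

The flood/road-closure report ranks highest and the off-topic report scores zero, which is the behaviour the manual-versus-system ranking comparison in the study relies on.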

    Exploratory analysis of OpenStreetMap for land use classification

    In recent years, volunteers have contributed massively to what we now know as Volunteered Geographic Information. This huge amount of data may hide a vast geographical richness, and research therefore needs to be conducted to explore its potential and use it in the solution of real-world problems. In this study we conduct an exploratory analysis of data from the OpenStreetMap initiative. Using the Corine Land Cover database as reference and continental Portugal as the study area, we establish a possible correspondence between both classification nomenclatures, evaluate the quality of the classification of OpenStreetMap polygon features against Corine Land Cover classes from the level 1 nomenclature, and analyze the spatial distribution of OpenStreetMap classes over continental Portugal. A global classification accuracy of around 76% and notable coverage-area values are remarkable and promising results that encourage future research on this topic.
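A global accuracy figure like the ~76% reported above is conventionally the trace of a confusion matrix divided by its total. A minimal sketch, with invented counts rather than the paper's data:

```python
# Overall (global) classification accuracy from a confusion matrix:
# rows are reference CLC level-1 classes, columns are OSM-derived classes.
# The counts below are illustrative only.

confusion = [
    [50,  5,  3],
    [ 6, 40,  4],
    [ 2,  4, 36],
]
correct = sum(confusion[i][i] for i in range(len(confusion)))  # diagonal
total = sum(sum(row) for row in confusion)
print(f"overall accuracy: {correct / total:.1%}")  # overall accuracy: 84.0%
```

Per-class producer's and user's accuracies come from the same matrix (row and column sums), which is why the confusion matrix is the standard reporting device for land cover comparisons.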

    Enhanced Place Name Search Using Semantic Gazetteers

    With the increased availability of geospatial data and efficient geo-referencing services, people are now more likely to engage in geospatial searches for information on the Web. Searching by address is supported by geocoding, which converts an address to a geographic coordinate. Addresses are one form of geospatial referencing that is relatively well understood and easy for people to use, but place names are generally the most intuitive natural language expressions that people use for locations. This thesis presents an approach for enhancing place name searches with a geo-ontology and a semantically enabled gazetteer. The approach investigates the extension of general spatial relationships to domain-specific, semantically rich concepts and spatial relationships. Hydrography is selected as the domain, and the thesis investigates the specification of semantic relationships between hydrographic features as functions of spatial relationships between their footprints. A Gazetteer Ontology (GazOntology) based on ISO standards is developed to associate a feature with a Spatial Reference. The Spatial Reference can be a GeoIdentifier, a text-based representation of a feature (usually a place name or zip code), or a Geometry representation, a spatial footprint of the feature. A Hydrological Features Ontology (HydroOntology) is developed to model canonical forms of hydrological features and their hydrological relationships. The classes are modelled as endurant classes, as in foundational ontologies such as DOLCE. The semantics of these relationships in a hydrological context are specified in the HydroOntology. The HydroOntology and GazOntology can be viewed as the semantic schema for the HydroGazetteer. The HydroGazetteer was developed as an RDF triplestore and populated with instances of named hydrographic features from the National Hydrography Dataset (NHD) for several watersheds in the state of Maine.
To determine which instances of surface hydrology features participate in the specified semantic relationships, information was obtained through spatial analysis of the NHD, the NHDPlus dataset and the Geographic Names Information System (GNIS). The 9-intersection model between point, line, directed line and region geometries, which identifies sets of relationships between geometries independent of what those geometries represent in the world, provided the basis for identifying semantic relationships between the canonical hydrographic feature types. The developed ontologies enable the HydroGazetteer to answer different categories of queries: place name queries involving the taxonomy of feature types, queries on relations between named places, and place name queries with reasoning. A simple user interface for selecting a hydrological relationship and a hydrological feature name was developed, with the results displayed on a USGS topographic base map. The approach demonstrates that spatial semantics can provide effective query disambiguation and more targeted spatial queries between named places based on relationships such as upstream, downstream, or flows through.
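The 9-intersection test described above is commonly serialised as a 9-character DE-9IM string, and a semantic relation can then be checked as a pattern match. A minimal pure-Python matcher, with an invented example relation (the pattern shown is the standard line-versus-region "crosses" pattern, a plausible basis for a "flows through" check, though the thesis's actual mapping may differ):

```python
# DE-9IM: interior/boundary/exterior intersections of two geometries,
# serialised row by row as 9 characters from {F, 0, 1, 2}.
# Pattern chars: 'T' = any non-empty intersection (dim 0, 1 or 2),
# '*' = don't care, 'F'/'0'/'1'/'2' = exact match.

def matches(de9im: str, pattern: str) -> bool:
    """Match a DE-9IM string against an intersection pattern."""
    for actual, wanted in zip(de9im, pattern):
        if wanted == "*":
            continue
        if wanted == "T" and actual in "012":
            continue
        if wanted == actual:
            continue
        return False
    return True

# Standard "crosses" pattern for a line vs. a region, e.g. a stream whose
# centreline passes through a lake polygon (example relation is invented):
CROSSES_LINE_REGION = "T*T******"
print(matches("1010F0212", CROSSES_LINE_REGION))  # True
```

Spatial libraries expose the same idea (e.g. a `relate` operation returning the DE-9IM string), so semantic relations reduce to string patterns over geometry pairs.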

    Geoinformatics in Citizen Science

    The book features contributions that report original research in the theoretical, technological, and social aspects of geoinformation methods, as applied to supporting citizen science. Specifically, the book focuses on the technological aspects of the field and their application toward the recruitment of volunteers and the collection, management, and analysis of geotagged information to support volunteer involvement in scientific projects. Internationally renowned research groups share research in three areas: first, the key methods of geoinformatics within citizen science initiatives to support scientists in discovering new knowledge in specific application domains or in performing relevant activities, such as reliable geodata filtering, management, analysis, synthesis, sharing, and visualization; second, the critical aspects of citizen science initiatives that call for emerging or novel approaches of geoinformatics to acquire and handle geoinformation; and third, novel geoinformatics research that could serve in support of citizen science.

    AUTHORITATIVE CARTOGRAPHY IN BRAZIL AND COLLABORATIVE MAPPING PLATFORMS: CHALLENGES AND PROPOSALS FOR DATA INTEGRATION

    Brazil has a large area with missing or outdated mapping at the largest scales of its authoritative mapping. The use of data from collaborative mapping platforms appears as an alternative that may help minimize this problem, either by updating or completing the mapping coverage of Brazil, as proposed or performed by some National Mapping Agencies abroad. The present work aims to analyze a methodology to provide accurate and documented integration of volunteered geographic information and the Brazilian authoritative mapping. The proposal starts with the semantic compatibility between the conceptual models adopted in the official cartography and in the OpenStreetMap platform. The research continues with the identification of the object classes with the most significant potential for integration. Finally, we developed experiments to evaluate and validate the OSM data integration process in a 1:25,000 scale cartographic database. Even in regions with recent mapping, the results of the preliminary assessment indicate a potential increase of about 52% and 16% in features of the 'road system' category, which suggests a very promising method for use in areas with missing or outdated mapping, and its applicability to other categories.

    Assessing the accuracy of OpenStreetMap data in South Africa for the purpose of integrating it with authoritative data

    The introduction and success of Volunteered Geographic Information (VGI) has gained the interest of National Mapping Agencies (NMAs) worldwide. VGI is geographic information that is freely generated by non-experts and shared using VGI initiatives available on the Internet. The NMA of South Africa, the Chief Directorate: National Geo-Spatial Information (CD:NGI), is looking to this volunteer information to maintain its topographical database; however, the main concern is the quality of the data. The purpose of this work is to assess whether it is feasible to use VGI to update the CD:NGI topographical database. Data from OpenStreetMap (OSM), one of the most successful VGI initiatives, was compared to a reference dataset provided by the CD:NGI. Corresponding features between the two datasets were compared in order to assess various quality aspects. The investigation was split into quantitative and qualitative assessments. The aim of the quantitative assessments was to determine the internal quality of the OSM data; the internal quality elements included positional accuracy, geometric accuracy, semantic accuracy and completeness. The first part of the qualitative assessment was concerned with the currency of OSM data between 2006 and 2012; the second part focused on the uniformity of OSM data acquisition across South Africa. The quantitative results showed that both road and building features do not meet the CD:NGI positional accuracy standards, although in some areas the positional accuracy of roads is close to the required accuracy. The buildings generally compare well in shape to the CD:NGI buildings; however, there were very few OSM polygon features to assess, so the results are limited to a small sample. The semantic accuracy of roads was low: volunteers do not generally classify roads correctly, preferring instead to class roads generically.
The last part of the quantitative results, completeness, revealed that commercial areas reach high completeness percentages and sometimes exceed the total length of the CD:NGI roads; in residential areas the percentages are lower, and in low urban density areas the lowest. Nonetheless, the OSM repository has seen significant growth since 2006. The qualitative results showed that because the OSM repository has continued to grow since 2006, the level of currency has increased. In South Africa, most contributions were made between 2010 and 2012, so the OSM dataset is current after 2012. The amount and type of contributions are, however, not uniform across the country for various reasons. The number of point contributions was low, so the relationship between the type of contribution and the settlement type could not be established with certainty. Because the OSM data does not meet the CD:NGI spatial accuracy requirements, the two datasets cannot be integrated at the database level. Instead, two options are proposed: the CD:NGI could use the OSM data for detecting changes to the landscape only, or it could transform and verify the OSM data and ingest only those features with a high positional accuracy. The CD:NGI currently has a shortage of staff qualified to process ancillary data, so both proposed options require automated techniques, because performing these tasks manually is time consuming.
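Positional accuracy checks like the one above are commonly quantified with the NSSDA statistic (used in the first thesis in this list): radial RMSE over checkpoint pairs, scaled by 1.7308 to give horizontal accuracy at the 95% confidence level, assuming equal x/y error. The checkpoint coordinates below are invented:

```python
# NSSDA-style horizontal accuracy from OSM-vs-reference checkpoints.
# Coordinates are illustrative, in metres in a projected CRS.
import math

osm = [(10.0, 20.0), (15.2, 30.1), (40.3, 50.0)]
ref = [(10.5, 19.6), (15.0, 30.5), (40.0, 50.4)]

# Radial RMSE over the checkpoint pairs.
rmse_r = math.sqrt(sum((xo - xr) ** 2 + (yo - yr) ** 2
                       for (xo, yo), (xr, yr) in zip(osm, ref)) / len(osm))
accuracy_95 = 1.7308 * rmse_r  # NSSDA horizontal accuracy at 95% confidence
print(f"RMSE_r = {rmse_r:.3f} m, NSSDA accuracy = {accuracy_95:.3f} m")
```

The resulting figure is compared directly against the mapping agency's accuracy standard for the target scale to decide whether features qualify for ingestion.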

    GEOBIA 2016: Solutions and Synergies, 14-16 September 2016, University of Twente Faculty of Geo-Information and Earth Observation (ITC): open access e-book
