
    Multispectral Image Road Extraction Based Upon Automated Map Conflation

    Road network extraction from remotely sensed imagery enables many important and diverse applications such as vehicle tracking, drone navigation, and intelligent transportation studies. There are, however, a number of challenges to road detection from an image. Road pavement material, width, direction, and topology vary across a scene. Complete or partial occlusions caused by nearby buildings, trees, and the shadows cast by them make maintaining road connectivity difficult. The problems posed by occlusions are exacerbated by the increasing use of oblique imagery from aerial and satellite platforms. Further, common objects such as rooftops and parking lots are made of materials similar or identical to road pavements. This problem of common materials is a classic case of a single land cover material existing for different land use scenarios. This work addresses these problems by leveraging the OpenStreetMap digital road map to guide image-based road extraction from geo-referenced imagery. The crowd-sourced cartography has the advantage of worldwide coverage that is constantly updated. The derived road vectors follow only roads and so can guide image-based road extraction with minimal confusion from occlusions and changes in road material. On the other hand, the vector road map has no information on road widths, and misalignments between the vector map and the geo-referenced image are small but nonsystematic. Properly correcting misalignment between two geospatial datasets, also known as map conflation, is therefore an essential step. A generic framework requiring minimal human intervention is described for multispectral image road extraction and automatic road map conflation. The approach relies on generating two road features: a binary mask and a corresponding curvilinear image. A method for generating the binary road mask from the image by applying a spectral measure is presented. The spectral measure, called the anisotropy-tunable distance (ATD), differs from conventional measures and is designed to account for both changes of spectral direction and changes of spectral magnitude in a unified fashion. The ATD measure is particularly suitable for differentiating urban targets such as roads and building rooftops. The curvilinear image provides estimates of the width and orientation of potential road segments. Road vectors derived from OpenStreetMap are then conflated to image road features by applying junction matching and intermediate point matching, followed by refinement with mean-shift clustering and morphological processing to produce a road mask with piecewise width estimates. The proposed approach is tested on a set of challenging, large, and diverse image data sets, and its accuracy is assessed. The method is effective for road detection and width estimation, even in challenging scenarios with extensive occlusion.
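    The abstract does not give the ATD formula, so the sketch below is only an illustration of the idea: a spectral measure that blends the change in spectral direction (angle) with the change in spectral magnitude through a tunable weight. The function name, the blending form, and the parameter alpha are assumptions, not the authors' definition.

```python
import numpy as np

def anisotropy_tunable_distance(x, y, alpha=0.5):
    """Hypothetical sketch of an ATD-style spectral measure.

    Blends spectral direction (angle between pixel spectra) with spectral
    magnitude (difference of vector norms); alpha tunes how strongly
    direction is weighted relative to magnitude. Assumed form only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cos_sim = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y) + 1e-12)
    angle = np.arccos(np.clip(cos_sim, -1.0, 1.0))           # direction change
    magnitude = abs(np.linalg.norm(x) - np.linalg.norm(y))   # magnitude change
    return alpha * angle + (1.0 - alpha) * magnitude

# Example: a road-like spectrum vs. a brighter rooftop-like spectrum that
# points in the same spectral direction but differs in magnitude.
road = [0.21, 0.25, 0.27, 0.30]
roof = [0.42, 0.50, 0.54, 0.60]
print(anisotropy_tunable_distance(road, roof, alpha=0.2))
```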

    Automated conflation framework for integrating transportation big datasets

    Merging data from various sources, commonly known as conflation, is a vital part of every phase of transportation development, whether planning, governing the existing system, or studying the effects of an intervention in the system. Conflation enriches existing data by integrating information from the many sources now available. The process is especially critical because of the complexities these diverse data bring, such as differing collection accuracies, projections, and naming conventions, and hence demands special attention. Although conflation has long been a topic of interest among researchers, the area has seen significant enthusiasm recently due to advances in data collection methods. Even with this escalation in interest, the methods developed have not kept pace with the expansion in data collection. Contemporary conflation algorithms still lack an efficient automated technique; most existing systems require some human involvement to achieve higher accuracy. This work establishes a fully automated process to conflate road segments for the state of Missouri from two big data sources. Taking traditional conflation a step further, the study also enriches the road segments with traffic information such as delay, volume, and route safety by conflating them with available traffic and crash data. The conflation accuracy achieved by the algorithm was 80-95 percent across the different data sources. The final conflated layer gives detailed information about the road network coupled with traffic parameters such as delay, travel time, route safety, and travel time reliability. By Neetu Choubey. Includes bibliographical references.
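    The thesis's matching rules are not detailed in the abstract; the following sketch illustrates one common way to conflate road segments from two sources by gating on geometric distance and scoring fuzzy name similarity. The segment names, geometries, thresholds, and weights are all hypothetical.

```python
from difflib import SequenceMatcher
from shapely.geometry import LineString

# Hypothetical segments: (name, geometry) pairs from two sources.
source_a = [("I-70 EB",  LineString([(0, 0), (100, 5)]))]
source_b = [("I70 East", LineString([(1, 1), (99, 6)])),
            ("Route K",  LineString([(500, 0), (600, 0)]))]

def match_segment(seg, candidates, max_dist=20.0, min_name_sim=0.4):
    """Return the best candidate by combining proximity and name similarity."""
    name, geom = seg
    best, best_score = None, 0.0
    for cand_name, cand_geom in candidates:
        if geom.distance(cand_geom) > max_dist:        # geometric gate
            continue
        name_sim = SequenceMatcher(None, name.lower(), cand_name.lower()).ratio()
        dist_score = 1.0 - geom.distance(cand_geom) / max_dist
        score = 0.5 * name_sim + 0.5 * dist_score      # assumed equal weights
        if score > best_score and name_sim >= min_name_sim:
            best, best_score = (cand_name, cand_geom), score
    return best, best_score

print(match_segment(source_a[0], source_b))
```

    Once segments are matched, traffic attributes (delay, volume, crash counts) from one source can simply be copied onto the matched geometry of the other.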

    Automatic Geospatial Data Conflation Using Semantic Web Technologies

    Duplicate geospatial data collection and maintenance are an extensive problem across Australian government organisations. This research examines how Semantic Web technologies can be used to automate the geospatial data conflation process. It presents a new approach in which OWL ontologies generated from output data models, together with geospatial data expressed as RDF triples, serve as the basis for the solution, while SWRL rules serve as the core for automating the geospatial data conflation process.
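    As a minimal sketch of the "geospatial data as RDF triples" idea, assuming rdflib and a made-up example namespace (the study's actual OWL ontologies and SWRL rules are not reproduced here):

```python
from rdflib import Graph, Literal, Namespace, RDF, URIRef
from rdflib.namespace import XSD

# Hypothetical namespace and feature URI; real ontologies would be generated
# from the target output data model.
EX = Namespace("http://example.org/geo#")
g = Graph()
road = URIRef("http://example.org/feature/road/42")

g.add((road, RDF.type, EX.Road))
g.add((road, EX.hasName, Literal("Main Street")))
g.add((road, EX.hasGeometry, Literal("LINESTRING(0 0, 10 2)", datatype=EX.wktLiteral)))
g.add((road, EX.sourceDataset, Literal("cadastral", datatype=XSD.string)))

print(g.serialize(format="turtle"))
```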

    Historical collaborative geocoding

    Recent digital developments have produced large data sets that can be accessed and used with increasing ease. These data sets often contain indirect localisation information, such as historical addresses. Historical geocoding is the process of transforming indirect localisation information into direct localisation that can be placed on a map, which enables spatial analysis and cross-referencing. Many efficient geocoders exist for current addresses, but they do not deal with the temporal aspect and are based on a strict hierarchy (..., city, street, house number) that is hard or impossible to use with historical data. Indeed, historical data are full of uncertainties (temporal aspect, semantic aspect, spatial precision, confidence in the historical source, ...) that cannot be resolved, as there is no way to go back in time to check. We propose an open source, open data, extensible solution for geocoding based on building gazetteers composed of geohistorical objects extracted from historical topographical maps. Once the gazetteers are available, geocoding a historical address is a matter of finding the geohistorical object in the gazetteers that best matches the historical address. The matching criteria are customisable and cover several dimensions (fuzzy semantic, fuzzy temporal, scale, spatial precision, ...). As the goal is to facilitate historical work, we also propose web-based user interfaces that help geocode addresses (individually or in batch mode) and display them over current or historical topographical maps, so that they can be checked and collaboratively edited. The system is tested on the city of Paris for the 19th-20th centuries, shows a high return rate, and is fast enough to be used interactively. Comment: working paper.
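    A minimal sketch of the gazetteer matching idea, combining fuzzy semantic, fuzzy temporal, and spatial precision criteria into one customisable score; the gazetteer entries, weights, and scoring form are illustrative assumptions rather than the paper's implementation.

```python
from difflib import SequenceMatcher

# Hypothetical gazetteer entries: (name, valid_from, valid_to, lon, lat, precision_m)
gazetteer = [
    ("Rue de la Paix", 1806, 1950, 2.3317, 48.8692, 10.0),
    ("Rue Napoleon",   1800, 1814, 2.3320, 48.8690, 50.0),
]

def score(query_name, query_year, entry, w_sem=0.6, w_time=0.3, w_prec=0.1):
    """Combine fuzzy semantic, fuzzy temporal, and precision criteria into
    one score. The weights are illustrative assumptions."""
    name, start, end, _, _, precision = entry
    semantic = SequenceMatcher(None, query_name.lower(), name.lower()).ratio()
    if start <= query_year <= end:
        temporal = 1.0
    else:
        temporal = 1.0 / (1 + min(abs(query_year - start), abs(query_year - end)))
    prec = 1.0 / (1.0 + precision / 10.0)
    return w_sem * semantic + w_time * temporal + w_prec * prec

best = max(gazetteer, key=lambda e: score("rue de la paix", 1890, e))
print(best)
```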

    Using Geographic Relevance (GR) to contextualize structured and unstructured spatial data

    Geographic relevance (GR) is a concept that has been used to improve spatial information retrieval on mobile devices, but it has several potential applications outside of mobile computing. Geographic relevance is used to measure how related two spatial entities are, using criteria such as the distance between features, the semantic similarity of feature names, or the clustering pattern of features. This thesis examines the use of geographic relevance to organize and filter web-based spatial data such as framework data from open data portals and unstructured volunteered geographic information (VGI) generated from social media or map-based surveys. There are many new users and producers of geographic information, and it is unclear to new users which data sets they should use to solve a given problem. Governments and organizations also have access to a growing volume of volunteered geographic information, but current models for matching citizen-generated information to locations of concern to support filtering and reporting are inadequate. For both problems, there is an opportunity to develop semi-automated solutions using geographic relevance metrics such as topicality, spatial proximity, clustering, and co-location. In this thesis, two geographic relevance models were developed using Python and PostgreSQL to measure relevance and identify relationships between structured framework data and unstructured VGI in order to support data organization, retrieval, and filtering. The idea was explored through two related case studies and prototype applications. The first study developed a prototype application to retrieve spatial data from open data portals using four geographic relevance criteria: topicality, proximity, co-location, and cluster co-location. The second study developed a prototype application that matches VGI data to authoritative framework data to dynamically summarize and organize unstructured VGI data. The thesis demonstrates two possible approaches for using GR metrics to evaluate spatial relevance between large data sets and individual features, evaluates the effectiveness of GR metrics for performing spatial relevance analysis, and demonstrates two potential use cases for GR.
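    A minimal sketch of a combined geographic relevance score, assuming Jaccard keyword overlap for topicality and exponential distance decay for proximity; the weights, decay constant, and inputs are illustrative, not the thesis's actual model.

```python
import math

def geographic_relevance(query_terms, feature_terms, query_xy, feature_xy,
                         w_topic=0.5, w_prox=0.5, decay_m=1000.0):
    """Illustrative GR score: keyword overlap (topicality) plus a
    distance-decay proximity term. Weights and decay are assumptions."""
    q, f = set(query_terms), set(feature_terms)
    topicality = len(q & f) / len(q | f) if q | f else 0.0   # Jaccard overlap
    dist = math.hypot(query_xy[0] - feature_xy[0], query_xy[1] - feature_xy[1])
    proximity = math.exp(-dist / decay_m)
    return w_topic * topicality + w_prox * proximity

# Example: a "flood, road" query against a feature tagged "road, bridge"
# located 250 m east and 400 m north of the query location.
print(geographic_relevance(["flood", "road"], ["road", "bridge"],
                           (0, 0), (250, 400)))
```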

    Developing tools and models for evaluating geospatial data integration of official and VGI data sources

    PhD thesis. In recent years, systems have been developed that enable users to produce, share, and update information on the web effectively and freely as User Generated Content (UGC), including Volunteered Geographic Information (VGI). Data quality assessment is a major concern for supporting the accurate and efficient spatial data integration required if VGI is to be used alongside official, formal, usually governmental datasets. This thesis aims to develop tools and models for assessing such integration possibilities. Initially, the geometrical similarity of formal and informal data was examined. Geometrical analyses were performed by developing specific programme interfaces to assess positional, linear, and polygon shape similarity among reference field survey data (FS); official datasets such as data from the Ordnance Survey (OS), UK, and the General Directorate for Survey (GDS), Iraq; and VGI such as OpenStreetMap (OSM) datasets. A discussion of the design and implementation of these tools and interfaces is presented. A methodology has been developed to assess positional and shape similarity by applying different metrics and standard indices: the National Standard for Spatial Data Accuracy (NSSDA) for positional quality, buffering overlays for linear similarity, and moment invariants for polygon shape similarity. The results suggested that difficulties exist for any geometrical integration of OSM data with both benchmark FS and formal datasets, but that formal data is very close to the reference datasets. An investigation was carried out into contributing factors, such as data source, feature type, and number of data collectors, that may affect the geometrical quality of OSM data and consequently the integration of OSM datasets with FS, OS, and GDS. Factorial designs were used to develop and implement an experiment to discover the effect of these factors individually and the interactions between them. The analysis found that data source is the most significant factor affecting the geometrical quality of OSM datasets, and that there are interactions among all these factors at different levels. This work also investigated the possibility of integrating the feature classifications of official datasets, such as data from the OS and GDS geospatial data agencies, and informal datasets, such as OSM. In this context, two different models were developed. The first set of analyses evaluated the semantic integration of corresponding feature classifications of the compared datasets. The second model assessed the ability of XML schema matching to relate the feature classifications of the tested datasets. This initially involved a tokenization process to split classifications composed of multiple words into single words. Feature classifications were then encoded as XML schema trees, and the semantic similarity, data type similarity, and structural similarity were measured between the nodes of the compared schema trees. Once these three similarities had been computed, a weighted combination technique was adopted to obtain the overall similarity. The findings of both sets of analyses were not encouraging as far as the possibility of effectively integrating the feature classifications of VGI datasets, such as OSM, and formal datasets, such as OS and GDS, is concerned. Ministry of Higher Education and Scientific Research, Republic of Iraq.
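    As one concrete piece of the positional quality workflow mentioned above, the sketch below computes the NSSDA horizontal accuracy statistic (95 percent accuracy taken as 1.7308 x RMSE_r, assuming roughly equal RMSE in x and y); the check points are made up.

```python
import math

# Hypothetical check points: (tested_x, tested_y, reference_x, reference_y)
points = [
    (100.2, 200.1, 100.0, 200.0),
    (150.4, 250.3, 150.0, 250.0),
    (199.7, 300.2, 200.0, 300.0),
]

def nssda_horizontal_accuracy(pts):
    """NSSDA horizontal accuracy at the 95% confidence level, computed as
    1.7308 * RMSE_r (valid when RMSE_x is approximately equal to RMSE_y)."""
    sq = [(tx - rx) ** 2 + (ty - ry) ** 2 for tx, ty, rx, ry in pts]
    rmse_r = math.sqrt(sum(sq) / len(pts))
    return 1.7308 * rmse_r

print(f"NSSDA horizontal accuracy: {nssda_horizontal_accuracy(points):.3f} m")
```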

    Interactive, multi-purpose traffic prediction platform using connected vehicles dataset

    Traffic congestion is a perennial issue because traffic demand keeps increasing while the budget for maintaining, let alone expanding, the current transportation infrastructure remains limited. Many congestion management techniques require timely and accurate traffic estimation and prediction. Examples of such techniques include incident management, real-time routing, and providing accurate trip information based on historical data. In this dissertation, a speech-powered traffic prediction platform is proposed, which deploys a new deep learning algorithm for traffic prediction using Connected Vehicle (CV) data. To speed up traffic forecasting, a Graph Convolution -- Gated Recurrent Unit (GC-GRU) architecture is proposed and its performance on tabular data is compared to state-of-the-art models. GC-GRU's Mean Absolute Percentage Error (MAPE) was very close to that of the Transformer (3.16 vs. 3.12) while achieving the fastest inference time and a six-fold faster training time than the Transformer, although Long Short-Term Memory (LSTM) was the fastest to train. Such improved performance in traffic prediction, with a shorter inference time and competitive training time, allows the proposed architecture to better cater to real-time applications. This is the first study to demonstrate the advantage of a multiscale approach that combines CV data with conventional sources such as Waze and probe data. CV data was better at detecting short-duration, jam, and stand-still incidents, and detected them earlier than probe data. CV data excelled at detecting minor incidents, with a 90 percent detection rate versus 20 percent for probes, and detected them 3 minutes faster. To process the big CV data faster, a new algorithm is proposed to extract the spatial and temporal features from the CSV files into a Multiscale Data Analysis (MDA). The algorithm also leverages the Graphics Processing Unit (GPU) using the NVIDIA RAPIDS framework and a Dask parallel cluster in Python. The results show a seventy-fold speedup in the Extract, Transform, Load (ETL) of an entire day of CV data for the State of Missouri across all unique CV journeys (reducing the processing time from about 48 hours to 25 minutes). The processed data is then fed into a customized UNet model that learns high-level traffic features from network-level images to predict large-scale, multi-route speed and volume of CVs. The accuracy and robustness of the proposed model are evaluated across different road types, times of day, and image snippets, and compared against benchmarks. To visually analyze the historical traffic data and the results of the prediction model, an interactive web application powered by speech queries is built to offer accurate and fast insights into traffic performance and thus allow for better positioning of traffic control strategies. The product of this dissertation can be seamlessly deployed by transportation authorities to understand and manage congestion in a timely manner. Includes bibliographical references.
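    The dissertation's RAPIDS/Dask pipeline is not reproduced in the abstract; the sketch below shows the general parallel ETL pattern with Dask dataframes, using hypothetical file paths and column names. With a GPU-backed (cuDF) dataframe backend, the same per-journey aggregation pattern applies.

```python
import dask.dataframe as dd

# Hypothetical CSV layout: one row per CV waypoint with journey_id, timestamp,
# latitude, longitude, and speed columns; the glob path is a placeholder.
cv = dd.read_csv(
    "cv_data/2021-06-01/*.csv",
    usecols=["journey_id", "timestamp", "lat", "lon", "speed"],
    parse_dates=["timestamp"],
)

# Aggregate per unique journey in parallel across the worker cluster.
mean_speed = cv.groupby("journey_id")["speed"].mean()
waypoints = cv.groupby("journey_id")["timestamp"].count()

print(mean_speed.compute().head())
print(waypoints.compute().head())
```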

    A Framework for Quality Evaluation of VGI linear datasets

    Spatial data collection, processing, distribution, and understanding have traditionally been handled by professionals. However, as technology advances, non-experts can now collect Geographic Information (GI), create spatial databases, and distribute GI through web applications. This Volunteered Geographic Information (VGI), as it is called, seems to be a promising spatial data source. Its most concerning issue, however, is its unknown and heterogeneous quality, which cannot be handled by traditional quality measurement methods; the quality elements these methods measure were standardised long before the appearance of VGI and assume uniform quality behaviour. The lack of a suitable quality evaluation framework with an appropriate level of automation, which would enable the quality assessment to be repeated whenever the VGI is updated, makes choosing to use it difficult or risky for potential users. This thesis proposes a framework for quality evaluation of linear VGI datasets used to represent networks. The suggested automated methodology is based on a comparison of a VGI dataset with a dataset of known quality. The heterogeneity issue is handled by producing individual results for small areal units using a tessellation grid. The quality elements measured are data completeness and attribute and positional accuracy, considered the most important for VGI. Compared to previous research, this thesis includes an automated data matching procedure specifically designed for VGI. It combines geometric and thematic constraints, shifting the scale of importance from geometry to non-spatial attributes depending on their existence in the VGI dataset. Based on the data matching results, all quality elements are then measured for corresponding objects, providing a more accurate quality assessment. The method is tested on three case studies. Data matching proves to be quite efficient, leading to more accurate quality results. The data completeness approach also tackles VGI over-completeness, which broadens the method's usage for data fusion purposes.
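    A minimal sketch of the kind of data matching described, in which the weight shifts from non-spatial attributes to geometry when the VGI feature lacks a name; the buffer size, weights, and toy geometries are assumptions, not the thesis's exact procedure.

```python
from difflib import SequenceMatcher
from shapely.geometry import LineString

def match_score(vgi_geom, vgi_name, ref_geom, ref_name, buffer_m=15.0):
    """Illustrative VGI-to-reference matching score.

    Geometric part: fraction of the VGI line inside a buffer around the
    reference line. Thematic part: fuzzy name similarity. If the VGI feature
    has no name attribute, all weight shifts to geometry.
    """
    inside = vgi_geom.intersection(ref_geom.buffer(buffer_m)).length
    geometric = inside / vgi_geom.length if vgi_geom.length else 0.0
    if vgi_name:
        thematic = SequenceMatcher(None, vgi_name.lower(), ref_name.lower()).ratio()
        return 0.4 * geometric + 0.6 * thematic   # assumed weights
    return geometric

osm_way = LineString([(0, 1), (50, 2), (100, 3)])
ref_way = LineString([(0, 0), (100, 0)])
print(match_score(osm_way, "High Street", ref_way, "High St"))
```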

    Proceedings of the Third Dutch-Belgian Information Retrieval Workshop (DIR 2002)


    Copula Based Population Synthesis and Big Data Driven Performance Measurement

    Transportation agencies all over the country are facing fiscal shortages due to the increasing costs of managing and maintaining facilities. The political reluctance to increase gas taxes, the primary source of revenue for many government transportation agencies, along with the improving fuel efficiency of automobiles sold to consumers, only exacerbates the financial dire straits. The adoption of electric vehicles threatens to completely stop the inflow of money to federal, state, and regional agencies. Consequently, expansion of the network and infrastructure is slowly being replaced by a more proactive approach to managing the use of existing facilities. The insights required to manage the network more efficiently are enabled in part by a massive increase in the type and volume of available data. These data are paving the way for network-wide Intelligent Transportation Systems (ITS), which promise to maximize utilization of current facilities. The waves of revolution overtaking the usual business affairs of transportation agencies have prompted the development and application of various analytical tools, models, and procedures to transportation. Contributions to this growth of analysis techniques are documented in this dissertation. Transportation has two main domains, demand and supply, which need to be managed simultaneously to push towards optimal use of resources and facilities and to minimize negative impacts such as time wasted in delays, environmental pollution, and greenhouse gas emissions. The two domains are quite distinct and require specialized solutions. This dissertation documents the developed techniques in two sections addressing the two domains of demand and supply. In the first section, a copula-based approach is demonstrated to produce a reliable and accurate synthetic population, which is essential to estimate demand correctly. The second section deals with big data analytics using simple models and fast algorithms to produce results in real time. The techniques developed target short-term traffic forecasting, linking multiple disparate datasets to power niche analytics, and quickly computing accurate measures of highway network performance to inform decisions made by facility operators in real time. The analyses presented in this dissertation target many core aspects of transportation science and enable the shared goal of providing safe, efficient, and equitable service to travelers. A synthetic population in transportation is used primarily to estimate transportation demand from an Activity-Based Modeling (ABM) framework containing well-fitted behavioral and choice models. It allows accurate verification of the impacts of policies on people's travel behavior, enabling confident implementation of policies, such as setting transit fares or tolls, designed for the common benefit of many. Further, accurate demand models allow for resilient and resourceful planning of new infrastructure and assets, or the repurposing of existing ones. On the other hand, short-term traffic speed predictions and speed-based reliable performance measures are key to providing advanced ITS services, such as real-time route guidance and traveler awareness, geared towards minimizing time, energy, and resource wastage and maximizing user satisfaction. Merging datasets allows the transfer of data such as traffic volumes and speeds between them, enabling computation of the global, network-wide impacts and externalities of transportation, such as greenhouse gas emissions and the time, energy, and resources consumed and wasted in traffic jams.
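    The abstract does not specify the copula formulation; the sketch below shows a generic Gaussian-copula approach to synthesizing a joint population from two marginals (made-up age and income samples), which is one common way such synthesis is done.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical marginal samples for two person attributes (age, income) taken
# from survey microdata; real ABM inputs would have many more attributes.
age = rng.normal(40, 12, 1000).clip(18, 90)
income = rng.lognormal(10.5, 0.6, 1000)

def to_normal_scores(x):
    """Transform observations to standard-normal scores via empirical ranks."""
    ranks = x.argsort().argsort() + 1
    return norm.ppf(ranks / (len(x) + 1))

# 1. Estimate the dependence structure (Gaussian copula correlation).
z = np.column_stack([to_normal_scores(age), to_normal_scores(income)])
corr = np.corrcoef(z, rowvar=False)

# 2. Sample from the copula and map back through the empirical marginals.
sim_z = rng.multivariate_normal(np.zeros(2), corr, size=5000)
sim_u = norm.cdf(sim_z)
synthetic = np.column_stack([
    np.quantile(age, sim_u[:, 0]),
    np.quantile(income, sim_u[:, 1]),
])
print(synthetic[:5])
```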