23 research outputs found

    A PageRank-based Reputation Model for VGI Data

    Abstract: Quality of data is one of the key issues in the domain of volunteered geographic information (VGI). To this purpose, VGI data has sometimes been compared in the literature with authoritative geospatial data. For some applications, however, the evaluation of single contributions to VGI databases is more relevant; it typically relies on evaluating the reputation of contributors and using it as a proxy measure for data quality. In this paper, we present a novel approach to reputation evaluation that is based on the well-known PageRank algorithm for Web pages. We use a simple model that describes different versions of a geospatial entity in terms of corrections and completions. Authors, VGI contributions, and their mutual relationships are modeled as nodes of a graph. To evaluate the reputation of authors and contributions in the graph, we propose an algorithm based on the personalized version of PageRank.
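    The personalized-PageRank idea described in this abstract can be sketched as follows. This is an illustrative, dependency-free sketch only: the toy graph, the correction-edge convention, and the damping/iteration parameters are assumptions for the example, not the paper's actual model.

```python
# Hedged sketch: personalized PageRank over a toy author/contribution graph.
# Authors link to the contributions they made; a correction edge points from
# the newer version to the version it improves on (an assumed convention).

def personalized_pagerank(graph, personalization, damping=0.85, iterations=100):
    """Power iteration for personalized PageRank.

    graph: dict mapping node -> list of out-neighbours.
    personalization: dict mapping node -> teleport probability (sums to 1).
    """
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # teleport mass goes to the personalization vector, not uniformly
        new_rank = {n: (1.0 - damping) * personalization.get(n, 0.0) for n in nodes}
        for n in nodes:
            out = graph[n]
            if out:
                share = damping * rank[n] / len(out)
                for m in out:
                    new_rank[m] += share
            else:
                # dangling node: redistribute its mass via the personalization vector
                for m in nodes:
                    new_rank[m] += damping * rank[n] * personalization.get(m, 0.0)
        rank = new_rank
    return rank

graph = {
    "author_A": ["v1"],
    "author_B": ["v2"],
    "v1": [],
    "v2": ["v1"],          # v2 corrects/completes v1
}
personalization = {"author_A": 0.5, "author_B": 0.5}
scores = personalized_pagerank(graph, personalization)
```

    Personalizing the teleport vector (here, restarting only at author nodes) is what lets reputation flow from trusted authors into the contribution versions they touch.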

    Data trustworthiness and user reputation as indicators of VGI quality

    ABSTRACTVolunteered geographic information (VGI) has entered a phase where there are both a substantial amount of crowdsourced information available and a big interest in using it by organizations. But the issue of deciding the quality of VGI without resorting to a comparison with authoritative data remains an open challenge. This article first formulates the problem of quality assessment of VGI data. Then presents a model to measure trustworthiness of information and reputation of contributors by analyzing geometric, qualitative, and semantic aspects of edits over time. An implementation of the model is running on a small data-set for a preliminary empirical validation. The results indicate that the computed trustworthiness provides a valid approximation of VGI quality
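    The edit-history intuition behind this kind of model can be sketched minimally: a feature's trustworthiness rises when later edits confirm it and falls when they correct it, and a contributor's reputation aggregates the trustworthiness of the features they touched. All function names, the update weight, and the confirm/correct dichotomy below are assumptions for illustration, not the article's actual formulas.

```python
# Hedged sketch: trust rises toward 1 on confirming edits, falls toward 0 on
# corrections; reputation is the mean trust of a contributor's features.

def update_trust(trust, edit_kind, weight=0.2):
    """Move trust toward 1 on a confirmation, toward 0 on a correction."""
    target = 1.0 if edit_kind == "confirm" else 0.0
    return trust + weight * (target - trust)

def feature_trust(edits, initial=0.5):
    """Fold a chronological list of edit kinds into one trust score."""
    trust = initial
    for kind in edits:
        trust = update_trust(trust, kind)
    return trust

def contributor_reputation(features_touched):
    """Mean trust over the features a contributor edited."""
    return sum(features_touched) / len(features_touched)

t = feature_trust(["confirm", "confirm", "correct", "confirm"])
```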

    Enhancing Data Classification Quality of Volunteered Geographic Information

    Geographic data is one of the fundamental components of any Geographic Information System (GIS). Nowadays, GIS has become part of everyday activities, such as searching for a destination, planning a trip, or looking up weather information. Without a reliable data source, systems cannot provide guaranteed services. In the past, geographic data was collected and processed exclusively by experts and professionals. However, the ubiquity of advanced technology has led to the evolution of Volunteered Geographic Information (VGI), in which geographic data is collected and produced by the general public. These changes influence the availability of geographic data, as ordinary people can work together to collect geographic data and produce maps. This particular trend is known as collaborative mapping. In collaborative mapping, the general public shares an online platform to collect, manipulate, and update information about geographic features. OpenStreetMap (OSM) is a prominent example of a collaborative mapping project, which aims to produce a free world map editable and accessible by anyone. During the last decade, VGI has expanded based on the power of crowdsourcing. The involvement of the public in data collection raises great concern about the resulting data quality. There exist various perspectives on geographic data quality; this dissertation focuses particularly on the quality of data classification (i.e., thematic accuracy). In professional data collection, data is classified based on quantitative and/or qualitative observations. According to a pre-defined classification model, which is usually constructed by experts, data is assigned to appropriate classes. In contrast, in most collaborative mapping projects data classification is based mainly on individuals' cognition. Through online platforms, contributors collect information about geographic features and transform their perceptions into classified entities.
    In VGI projects, the contributors mostly have limited experience in geography and cartography. Therefore, the acquired data may have a questionable classification quality. This dissertation investigates the challenges of data classification in VGI-based mapping projects (i.e., collaborative mapping projects). In particular, it lists the challenges relevant to the evolution of VGI as well as to the characteristics of geographic data. Furthermore, this work proposes a guiding approach to enhance the data classification quality in such projects. The proposed approach is based on the following premises: (i) the availability of large amounts of data, which fosters applying machine learning techniques to extract useful knowledge; (ii) utilization of the extracted knowledge to guide contributors to appropriate data classification; (iii) the humanitarian spirit of contributors to provide precise data when they are supported by a guidance system; and (iv) the power of crowdsourcing in data collection as well as in ensuring the data quality. This cumulative dissertation consists of five peer-reviewed publications in international conference proceedings and international journals. The publications divide the dissertation into three parts: the first part presents a comprehensive literature review of the relevant previous work on VGI quality assurance procedures (Chapter 2), the second part studies the foundations of the approach (Chapters 3-4), and the third part discusses the proposed approach and provides a validation example for implementing it (Chapters 5-6). Furthermore, Chapter 1 presents an overview of the research questions and the adopted research methodology, while Chapter 7 concludes the findings and summarizes the contributions. The proposed approach is validated through empirical studies and an implemented web application. The findings reveal the feasibility of the proposed approach.
    The output shows that applying the proposed approach results in enhanced data classification quality. Furthermore, the research highlights the demand for intuitive data collection and data interpretation approaches adequate to VGI-based mapping projects. An interactive data collection approach is required to guide contributors toward enhanced data quality, while an intuitive data interpretation approach is needed to derive more precise information from rich VGI resources.
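    Premises (i)-(ii) above, learning from existing classified data and using that knowledge to guide contributors, can be illustrated with a toy classifier. This is not the dissertation's method: the 1-nearest-neighbour rule, the two feature attributes (footprint area and vertex count), and the class labels are all invented for the sketch.

```python
# Hedged illustration: suggest a class for a newly drawn feature from the
# classes of similar existing features. Training data is entirely invented.
import math

# (area_m2, vertex_count) -> class, standing in for knowledge extracted
# from already-classified VGI features
training = [
    ((120.0, 5), "building"),
    ((90.0, 4), "building"),
    ((15000.0, 40), "park"),
    ((22000.0, 55), "park"),
]

def suggest_class(feature, examples=training):
    """Return the class of the nearest training example (Euclidean distance)."""
    _, label = min(examples, key=lambda ex: math.dist(ex[0], feature))
    return label

# A small 4-vertex polygon looks like the existing buildings
suggestion = suggest_class((100.0, 4))
```

    In a guidance system of this kind, the suggestion would be shown to the contributor rather than applied automatically, keeping the human in the loop as premise (iii) requires.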

    Multi-Dimensional Recommendation Scheme for Social Networks Considering a User Relationship Strength Perspective

    Developing a computational method based on user relationship strength for multi-dimensional recommendation is a significant challenge. Traditional recommendation methods have relatively low accuracy because they do not incorporate user relationship strength into the recommendation algorithm. User relationship strength reflects the degree of closeness between two users, which can make recommendations between pairs of users more effective. This paper proposes a multi-dimensional comprehensive recommendation method based on user relationship strength. We take three main factors into consideration: the strength of the user relationship, the similarity of entities, and the degree of user interest. First, we introduce a novel method to generate a user candidate set and an entity candidate set by calculating the relationship strength between two users and the similarity between two entities. Then, the algorithm calculates the interest degree of each user in the user candidate set for each entity in the entity candidate set; if the interest degree is greater than or equal to a threshold, that entity is recommended to the user. The performance of the proposed method was verified on a real-world social network dataset and an e-commerce website dataset, and the experimental results suggest that the method improves recommendation accuracy.
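    The thresholded recommendation step described above can be sketched as follows. The linear weighting of the three factors, the weight values, and the threshold are assumptions for illustration; the paper's exact formulas are not reproduced here.

```python
# Hedged sketch: combine relationship strength, entity similarity, and base
# interest into one interest degree, then recommend entities that clear a
# threshold. Weights and threshold are illustrative assumptions.

def interest_degree(strength, similarity, base_interest,
                    w_strength=0.4, w_similarity=0.3, w_interest=0.3):
    """Weighted combination of the three factors, each assumed in [0, 1]."""
    return (w_strength * strength
            + w_similarity * similarity
            + w_interest * base_interest)

def recommend(candidates, threshold=0.5):
    """candidates: list of (user, entity, strength, similarity, interest)."""
    recs = []
    for user, entity, s, sim, base in candidates:
        if interest_degree(s, sim, base) >= threshold:
            recs.append((user, entity))
    return recs

recs = recommend([
    ("u1", "e1", 0.9, 0.8, 0.7),   # strong tie, similar entity: recommended
    ("u1", "e2", 0.1, 0.2, 0.1),   # weak on all three factors: filtered out
])
```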

    Using Geographic Relevance (GR) to contextualize structured and unstructured spatial data

    Geographic relevance is a concept that has been used to improve spatial information retrieval on mobile devices, but it has several potential applications outside of mobile computing. Geographic relevance measures how related two spatial entities are using a set of criteria such as the distance between features, the semantic similarity of feature names, or the clustering pattern of features. This thesis examines the use of geographic relevance to organize and filter web-based spatial data such as framework data from open data portals and unstructured volunteered geographic information generated from social media or map-based surveys. There are many new users and producers of geographic information, and it is unclear to new users which datasets they should use to solve a given problem. Governments and organizations also have access to a growing volume of volunteered geographic information, but current models for matching citizen-generated information to locations of concern, to support filtering and reporting, are inadequate. For both problems, there is an opportunity to develop semi-automated solutions using geographic relevance metrics such as topicality, spatial proximity, cluster, and co-location. In this thesis, two geographic relevance models were developed using Python and PostgreSQL to measure relevance and identify relationships between structured framework data and unstructured VGI in order to support data organization, retrieval, and filtering. This idea was explored through two related case studies and prototype applications. The first study developed a prototype application to retrieve spatial data from open data portals using four geographic relevance criteria: topicality, proximity, co-location, and cluster co-location. The second study developed a prototype application that matches VGI data to authoritative framework data to dynamically summarize and organize unstructured VGI data.
    This thesis demonstrates two possible approaches for using GR metrics to evaluate spatial relevance between large datasets and individual features. It evaluates the effectiveness of GR metrics for performing spatial relevance analysis and demonstrates two potential use cases for GR.
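    Two of the GR criteria named above, topicality and spatial proximity, can be sketched as a combined relevance score. The Jaccard measure for topicality, the exponential distance decay, its decay constant, and the equal weighting are all assumptions for illustration, not the thesis's parameters.

```python
# Hedged sketch: relevance between two features as an equal-weight blend of
# keyword overlap (topicality) and distance decay (proximity).
import math

def topicality(tags_a, tags_b):
    """Jaccard overlap of the two features' keyword sets."""
    a, b = set(tags_a), set(tags_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def proximity(xy_a, xy_b, decay=1000.0):
    """Distance decay: 1 at zero distance, falling off with separation (m)."""
    return math.exp(-math.dist(xy_a, xy_b) / decay)

def relevance(feat_a, feat_b):
    return 0.5 * topicality(feat_a["tags"], feat_b["tags"]) \
         + 0.5 * proximity(feat_a["xy"], feat_b["xy"])

# e.g. matching a map-based survey report to a nearby framework feature
park   = {"tags": ["park", "recreation"], "xy": (0.0, 0.0)}
survey = {"tags": ["park", "litter"],     "xy": (300.0, 400.0)}  # 500 m away
score = relevance(park, survey)
```

    Ranking candidate framework features by such a score is one way a semi-automated matcher could shortlist where an unstructured VGI report belongs.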

    Methods and Measures for Analyzing Complex Street Networks and Urban Form

    Complex systems have been widely studied by social and natural scientists in terms of their dynamics and their structure. Scholars of cities and urban planning have incorporated complexity theories from qualitative and quantitative perspectives. From a structural standpoint, urban form may be characterized by the morphological complexity of its circulation networks, particularly their density, resilience, centrality, and connectedness. This dissertation unpacks theories of nonlinearity and complex systems, then develops a framework for assessing the complexity of urban form and street networks. It introduces a new tool, OSMnx, to collect street network and other urban form data for anywhere in the world, then analyze and visualize them. Finally, it presents a large empirical study of 27,000 street networks, examining their metric and topological complexity relevant to urban design, transportation research, and the human experience of the built environment. (PhD thesis, 2017, City and Regional Planning, UC Berkeley)
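    Two of the morphological measures this kind of analysis rests on, streets per node and circuity, can be computed by hand on a toy graph. This is not OSMnx itself but a dependency-free sketch; the four-node square network and its edge lengths are invented for the example.

```python
# Hedged sketch of two street-network measures on a tiny hand-made graph.
# Circuity compares measured street length to straight-line length: straight
# grids score ~1, curvier networks score higher.
import math

# node -> (x, y) in metres; each edge carries its measured street length,
# which exceeds the straight-line distance when the street curves
nodes = {"a": (0, 0), "b": (100, 0), "c": (100, 100), "d": (0, 100)}
edges = [("a", "b", 100.0), ("b", "c", 120.0),   # b-c curves slightly
         ("c", "d", 100.0), ("d", "a", 100.0)]

def avg_streets_per_node():
    """Mean number of street segments incident to each node."""
    deg = {n: 0 for n in nodes}
    for u, v, _ in edges:
        deg[u] += 1
        deg[v] += 1
    return sum(deg.values()) / len(nodes)

def circuity():
    """Total street length divided by total straight-line length (>= 1)."""
    street = sum(length for _, _, length in edges)
    straight = sum(math.dist(nodes[u], nodes[v]) for u, v, _ in edges)
    return street / straight
```

    At scale, OSMnx computes measures like these directly from OpenStreetMap data for an arbitrary study area, which is what enables the 27,000-network empirical study.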

    Localizing the media, locating ourselves: a critical comparative analysis of socio-spatial sorting in locative media platforms (Google AND Flickr 2009-2011)

    In this thesis I explore media geocoding (i.e., geotagging or georeferencing), the process of inscribing media with geographic information, a process that enables distinct forms of producing, storing, and distributing information based on location. Historically, geographic information technologies have served a biopolitical function, producing knowledge of populations. In their current guise as locative media platforms, these systems build rich databases of places facilitated by user-generated geocoded media. These geoindexes, this thesis argues, render places, and the users of these services, subject to novel forms of computational modelling and economic capture. Thus, the possibility of tying information, people, and objects to location sets the conditions for the emergence of new communicative practices as well as new forms of governmentality (the management of populations). This project attempts to develop an understanding of the socio-economic forces and media regimes structuring contemporary forms of location-aware communication by carrying out a comparative analysis of two of the main current location-enabled platforms: Google and Flickr. Drawing on the medium-specific approach to media analysis characteristic of the subfield of Software Studies, together with the methodological apparatus of Cultural Analytics (data mining and visualization methods), the thesis focuses on examining how social space is coded and computed in these systems. In particular, it looks at the databases' underlying ontologies supporting the platforms' geocoding capabilities and their respective algorithmic logics.
    In the final analysis, the thesis argues that the way social space is translated into POIs (Points of Interest) and business-biased categorizations, as well as the geodemographical ordering underpinning the way it is computed, is pivotal to understanding what kinds of socio-spatial relations are actualized in these systems and what modalities of governing urban mobility are enabled.