23 research outputs found
A PageRank-based Reputation Model for VGI Data
Quality of data is one of the key issues in the domain of volunteered geographic information (VGI). To assess it, the literature has sometimes compared VGI data with authoritative geospatial data. For some applications, evaluating single contributions to VGI databases is more relevant; this typically relies on evaluating the reputation of contributors and using it as a proxy measure for data quality. In this paper, we present a novel approach for reputation evaluation based on the well-known PageRank algorithm for Web pages. We use a simple model that describes different versions of a geospatial entity in terms of corrections and completions. Authors, VGI contributions, and their mutual relationships are modeled as nodes of a graph. To evaluate the reputation of authors and contributions in the graph, we propose an algorithm based on the personalized version of PageRank.
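As a rough illustration of the personalized PageRank idea mentioned in this abstract, the following is a minimal sketch and not the paper's algorithm. The toy graph, the edge semantics (authors linked to their contributions, a newer version pointing to the version it completes), the node names, and the uniform personalization over authors are all assumptions made for the example.

```python
def personalized_pagerank(edges, personalization, damping=0.85, iterations=100):
    """Power iteration for personalized PageRank on a directed graph.

    edges: list of (source, target) pairs.
    personalization: dict mapping node -> non-negative teleport weight.
    """
    nodes = sorted({n for edge in edges for n in edge} | set(personalization))
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Normalize the teleport weights so they sum to 1.
    total = sum(personalization.values())
    teleport = {n: personalization.get(n, 0.0) / total for n in nodes}

    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
            else:
                # Dangling node: hand its mass back via the teleport vector.
                for n in nodes:
                    new_rank[n] += damping * rank[src] * teleport[n]
        rank = new_rank
    return rank


# Hypothetical toy graph: three authors, one contribution (version) each.
# "v2 -> v1" and "v3 -> v2" stand for a newer version building on an older one.
EDGES = [
    ("alice", "v1"), ("v1", "alice"),
    ("bob", "v2"), ("v2", "bob"), ("v2", "v1"),
    ("carol", "v3"), ("v3", "carol"), ("v3", "v2"),
]
AUTHORS = {"alice": 1.0, "bob": 1.0, "carol": 1.0}

scores = personalized_pagerank(EDGES, AUTHORS)
for node in sorted(scores, key=scores.get, reverse=True):
    print(f"{node}: {scores[node]:.3f}")
```

In this toy setup, score flows toward versions that later versions build on, and from there to their authors. How corrections and completions should actually be weighted is defined by the paper's model, which this sketch does not reproduce.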
Data trustworthiness and user reputation as indicators of VGI quality
Volunteered geographic information (VGI) has entered a phase in which a substantial amount of crowdsourced information is available and organizations show great interest in using it. However, deciding the quality of VGI without resorting to a comparison with authoritative data remains an open challenge. This article first formulates the problem of quality assessment of VGI data. It then presents a model to measure the trustworthiness of information and the reputation of contributors by analyzing geometric, qualitative, and semantic aspects of edits over time. An implementation of the model is run on a small dataset for a preliminary empirical validation. The results indicate that the computed trustworthiness provides a valid approximation of VGI quality.
Enhancing Data Classification Quality of Volunteered Geographic Information
Geographic data is one of the fundamental components of any Geographic Information System (GIS). Nowadays, GIS has become part of everyday activities, such as searching for a destination, planning a trip, or looking for weather information. Without a reliable data source, such systems cannot provide dependable services. In the past, geographic data was collected and processed exclusively by experts and professionals. However, the ubiquity of advanced technology has led to the emergence of Volunteered Geographic Information (VGI), in which geographic data is collected and produced by the general public. These changes affect the availability of geographic data, since ordinary people can work together to collect geographic data and produce maps. This trend is known as collaborative mapping. In collaborative mapping, the general public shares an online platform to collect, manipulate, and update information about geographic features. OpenStreetMap (OSM) is a prominent example of a collaborative mapping project, which aims to produce a free world map editable and accessible by anyone. During the last decade, VGI has expanded based on the power of crowdsourcing. The involvement of the public in data collection raises great concern about the resulting data quality. There are various perspectives on geographic data quality; this dissertation focuses particularly on the quality of data classification (i.e., thematic accuracy). In professional data collection, data is classified based on quantitative and/or qualitative observations. According to a pre-defined classification model, which is usually constructed by experts, data is assigned to appropriate classes. In contrast, in most collaborative mapping projects data classification is mainly based on individuals' cognition. Through online platforms, contributors collect information about geographic features and transform their perceptions into classified entities.
In VGI projects, the contributors mostly have limited experience in geography and cartography. Therefore, the acquired data may have a questionable classification quality. This dissertation investigates the challenges of data classification in VGI-based mapping projects (i.e., collaborative mapping projects). In particular, it lists the challenges relevant to the evolution of VGI as well as to the characteristics of geographic data. Furthermore, this work proposes a guiding approach to enhance the data classification quality in such projects. The proposed approach is based on the following premises: (i) the availability of large amounts of data, which fosters applying machine learning techniques to extract useful knowledge, (ii) utilization of the extracted knowledge to guide contributors to appropriate data classification, (iii) the humanitarian spirit of contributors to provide precise data when they are supported by a guidance system, and (iv) the power of crowdsourcing in data collection as well as in ensuring the data quality. This cumulative dissertation consists of five peer-reviewed publications in international conference proceedings and international journals. The publications divide the dissertation into three parts: the first part presents a comprehensive literature review of relevant previous work on VGI quality assurance procedures (Chapter 2), the second part studies the foundations of the approach (Chapters 3-4), and the third part discusses the proposed approach and provides a validation example for implementing the approach (Chapters 5-6). Furthermore, Chapter 1 presents an overview of the research questions and the adopted research methodology, while Chapter 7 concludes the findings and summarizes the contributions. The proposed approach is validated through empirical studies and an implemented web application. The findings reveal the feasibility of the proposed approach.
The output shows that applying the proposed approach results in enhanced data classification quality. Furthermore, the research highlights the need for intuitive data collection and data interpretation approaches suited to VGI-based mapping projects. An interactive data collection approach is required to guide the contributors toward enhanced data quality, while an intuitive data interpretation approach is needed to derive more precise information from rich VGI resources.
Multi-Dimensional Recommendation Scheme for Social Networks Considering a User Relationship Strength Perspective
Developing a computational method based on user relationship strength for multi-dimensional recommendation is a significant challenge. Traditional recommendation methods have relatively low accuracy because they do not incorporate user relationship strength into the recommendation algorithm. User relationship strength reflects the degree of closeness between two users, which can make the recommendation system more effective for pairs of users. This paper proposes a multi-dimensional comprehensive recommendation method based on user relationship strength. We take three main factors into consideration: the strength of the user relationship, the similarity of entities, and the degree of user interest. First, we introduce a novel method to generate a user candidate set and an entity candidate set by calculating the relationship strength between two users and the similarity between two entities. Then, the algorithm calculates the interest degree of each user in the user candidate set for each entity in the entity candidate set; if the user interest degree is greater than or equal to a threshold, the entity is recommended to that user. The performance of the proposed method was verified on a real-world social network dataset and an e-commerce website dataset, and the experimental results suggest that this method can improve recommendation accuracy.
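The final thresholding step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the candidate sets, the interest-degree values, and the threshold are made up, and the paper's formulas for relationship strength, entity similarity, and interest degree are not reproduced here.

```python
def recommend(user_candidates, entity_candidates, interest_degree, threshold):
    """Recommend each entity whose interest degree meets the threshold.

    interest_degree: callable (user, entity) -> float, assumed to be given.
    Returns a dict mapping each user to the list of recommended entities.
    """
    recommendations = {}
    for user in user_candidates:
        recommendations[user] = [
            entity
            for entity in entity_candidates
            if interest_degree(user, entity) >= threshold
        ]
    return recommendations


# Hypothetical interest degrees for two users and two entities.
INTEREST = {
    ("u1", "book"): 0.9,
    ("u1", "lamp"): 0.4,
    ("u2", "book"): 0.5,
    ("u2", "lamp"): 0.5,
}

recs = recommend(
    user_candidates=["u1", "u2"],
    entity_candidates=["book", "lamp"],
    interest_degree=lambda user, entity: INTEREST[(user, entity)],
    threshold=0.5,
)
print(recs)
```

With a threshold of 0.5, the comparison is inclusive, matching the "greater than or equal to" condition in the abstract.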
Using Geographic Relevance (GR) to contextualize structured and unstructured spatial data
Geographic relevance is a concept that has been used to improve spatial information retrieval on mobile devices, but it has several potential applications outside of mobile computing. Geographic relevance is used to measure how related two spatial entities are, using a set of criteria such as the distance between features, the semantic similarity of feature names, or the clustering pattern of features. This thesis examines the use of geographic relevance to organize and filter web-based spatial data such as framework data from open data portals and unstructured volunteered geographic information generated from social media or map-based surveys. There are many new users and producers of geographic information, and it is unclear to new users which data sets they should use to solve a given problem. Governments and organizations also have access to a growing volume of volunteered geographic information, but current models for matching citizen-generated information to locations of concern to support filtering and reporting are inadequate. For both problems, there is an opportunity to develop semi-automated solutions using geographic relevance metrics such as topicality, spatial proximity, cluster and co-location. In this thesis, two geographic relevance models were developed using Python and PostgreSQL to measure relevance and identify relationships between structured framework data and unstructured VGI in order to support data organization, retrieval and filtering. This idea was explored through two related case studies and prototype applications. The first study developed a prototype application to retrieve spatial data from open data portals using four geographic relevance criteria: topicality, proximity, co-location and cluster co-location. The second study developed a prototype application that matches VGI data to authoritative framework data to dynamically summarize and organize unstructured VGI data.
This thesis demonstrates two possible approaches for using GR metrics to evaluate spatial relevance between large data sets and individual features. It evaluates the effectiveness of GR metrics for performing spatial relevance analysis and demonstrates two potential use cases for GR.
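To make the notion of a geographic relevance score concrete, here is a minimal sketch that combines two of the criteria named in this abstract, spatial proximity and topicality. It is not the thesis's model: the linear distance decay, the use of string similarity as a stand-in for topicality, the equal weights, the 5 km cutoff, and the sample features are all assumptions for illustration.

```python
import math
from difflib import SequenceMatcher

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def relevance(feature_a, feature_b, max_distance_km=5.0, w_proximity=0.5, w_topicality=0.5):
    """Weighted relevance score in [0, 1] between two named point features.

    Each feature is a dict with "name", "lat", and "lon" keys (assumed schema).
    """
    distance = haversine_km(feature_a["lat"], feature_a["lon"],
                            feature_b["lat"], feature_b["lon"])
    # Proximity decays linearly to 0 at max_distance_km (an assumed decay).
    proximity = max(0.0, 1.0 - distance / max_distance_km)
    # Name similarity is used here as a crude proxy for topicality.
    topicality = SequenceMatcher(None, feature_a["name"].lower(),
                                 feature_b["name"].lower()).ratio()
    return w_proximity * proximity + w_topicality * topicality


# Hypothetical features: an authoritative record and two VGI reports.
park = {"name": "Riverside Park", "lat": 43.4700, "lon": -80.5200}
near_report = {"name": "Riverside Park", "lat": 43.4700, "lon": -80.5200}
far_report = {"name": "Broken streetlight", "lat": 43.6500, "lon": -79.3800}

near_score = relevance(park, near_report)
far_score = relevance(park, far_report)
print(f"near: {near_score:.2f}, far: {far_score:.2f}")
```

A report at the same location with the same name scores highest, while a distant report on an unrelated topic scores low, which is the kind of ordering a filtering or matching step could use.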
Methods and Measures for Analyzing Complex Street Networks and Urban Form
Complex systems have been widely studied by social and natural scientists in
terms of their dynamics and their structure. Scholars of cities and urban
planning have incorporated complexity theories from qualitative and
quantitative perspectives. From a structural standpoint, the urban form may be
characterized by the morphological complexity of its circulation networks -
particularly their density, resilience, centrality, and connectedness. This
dissertation unpacks theories of nonlinearity and complex systems, then
develops a framework for assessing the complexity of urban form and street
networks. It introduces a new tool, OSMnx, to collect street network and other
urban form data for anywhere in the world, then analyze and visualize them.
Finally, it presents a large empirical study of 27,000 street networks,
examining their metric and topological complexity relevant to urban design,
transportation research, and the human experience of the built environment. Comment: PhD thesis (2017), City and Regional Planning, UC Berkele
Localizing the media, locating ourselves: a critical comparative analysis of socio-spatial sorting in locative media platforms (Google AND Flickr 2009-2011)
In this thesis I explore media geocoding (i.e., geotagging or georeferencing),
the process of inscribing the media with geographic information, a process
that enables distinct forms of producing, storing, and distributing information
based on location. Historically, geographic information technologies have
served a biopolitical function producing knowledge of populations. In their
current guise as locative media platforms, these systems build rich
databases of places facilitated by user-generated geocoded media. These
geoindexes render places, and users of these services, this thesis argues,
subject to novel forms of computational modelling and economic capture.
Thus, the possibility of tying information, people and objects to location sets
the conditions for the emergence of new communicative practices as well as
new forms of governmentality (management of populations). This project is
an attempt to develop an understanding of the socio-economic forces and
media regimes structuring contemporary forms of location-aware
communication, by carrying out a comparative analysis of two of the main
current location-enabled platforms: Google and Flickr. Drawing from the
medium-specific approach to media analysis characteristic of the subfield of
Software Studies, together with the methodological apparatus of Cultural
Analytics (data mining and visualization methods), the thesis focuses on
examining how social space is coded and computed in these systems. In
particular, it looks at the databases’ underlying ontologies supporting the
platforms' geocoding capabilities and their respective algorithmic logics. In
the final analysis the thesis argues that the way social space is translated in
the form of POIs (Points of Interest) and business-biased categorizations, as
well as the geodemographical ordering underpinning the way it is computed,
are pivotal if we are to understand what kind of socio-spatial relations are
actualized in these systems, and what modalities of governing urban mobility
are enabled.