129 research outputs found

    Towards Cleaning-up Open Data Portals: A Metadata Reconciliation Approach

    This paper presents an approach for metadata reconciliation, curation and linking for Open Governmental Data Portals (ODPs). ODPs have lately become the standard solution for governments willing to make their public data available to society. Portal managers use several types of metadata to organize the datasets, one of the most important being tags. However, the tagging process is subject to many problems, such as synonyms, ambiguity and incoherence, among others. As our empirical analysis of ODPs shows, these issues are currently prevalent in most ODPs and effectively hinder the reuse of Open Data. To address these problems, we develop and implement an approach for tag reconciliation in Open Data Portals, encompassing local actions related to individual portals and global actions for adding a semantic metadata layer above individual portals. The local part aims to enhance the quality of tags in a single portal, and the global part is meant to interlink ODPs by establishing relations between tags. Comment: 8 pages, 10 figures. Under revision for ICSC201
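The local reconciliation step described above, resolving synonyms and near-duplicate tags within a single portal, can be sketched as a simple normalization pass. This is an illustrative sketch only; the tag list and synonym map below are hypothetical examples, not data or code from the paper.

```python
# Illustrative sketch of local tag reconciliation for one portal.
# The synonym map and tag examples are hypothetical, not the paper's data.

def reconcile_tags(tags, synonyms):
    """Normalize raw portal tags and collapse known synonyms,
    preserving first-seen order and dropping duplicates."""
    canonical = []
    seen = set()
    for tag in tags:
        t = tag.strip().lower().replace("_", " ")  # basic normalization
        t = synonyms.get(t, t)                     # collapse known synonyms
        if t and t not in seen:
            seen.add(t)
            canonical.append(t)
    return canonical

synonyms = {"healthcare": "health", "health care": "health",
            "transportation": "transport"}
raw = ["Health Care", "healthcare", "Transport", "transportation", "health"]
print(reconcile_tags(raw, synonyms))  # ['health', 'transport']
```

A production system would replace the hand-written synonym map with a thesaurus or embedding-based similarity, but the dedup-and-canonicalize loop is the same.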

    The value and challenges of providing and accessing Government open data in developing countries: Kenyan context from a citizen’s perspective

    In recent years, governments in both developed and developing countries have been embracing open data initiatives as fundamental to facilitating government transparency, accountability, and public participation by making data freely available to the public. In addition, open data serves as an essential cornerstone in supporting technological innovation and economic growth by enabling third parties to develop new kinds of digital applications and services (Gray 2014; Ding et al. 2011; Shadbolt et al. 2012). Despite the rising uptake of such initiatives, little has been written on the experience, skills, and knowledge of citizens in open data and technology environments. This paper seeks to fill this gap by presenting unique lessons learnt from the implementation of Kenya's globally acclaimed Open Data Portal, launched in July 2011. Kenya forms an interesting study choice as the country was the first developing country in sub-Saharan Africa, and the second on the continent after Morocco, to develop such a portal. The portal, powered by Socrata Inc, aims to make core government developmental, demographic, statistical and expenditure data available to researchers, policymakers, ICT developers and the general public. The varying technological, economic, and cultural differences in Kenya significantly affect access and usage of the portal, as seen in wide inequalities in technical expertise, internet access, and extent of use. In addition, various system and management challenges inhibit the utility and ease of interaction of the portal. These challenges include empty datasets, broken links, obsolete information, and the absence of numerous datasets requested by the public, with some requests dating back over two years. The authors, who are Kenyan citizens, explore the challenges and best practices learnt from the implementation of the Kenya Open Data Portal and discuss, from a citizen's viewpoint, these unique and interesting findings and how they compare and contrast with those of other countries.

    GI Systems for public health with an ontology based approach

    Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies. Health is an indispensable attribute of human life. In the modern age, utilizing technologies for health is one of the emerging concepts in several applied fields; computer science and (geographic) information systems are among the interdisciplinary fields that motivate this thesis. The inspiring idea of the study originates from a rhetorical disease, DbHd (Database Hugging Disorder), defined by Hans Rosling in his World Bank Open Data speech in May 2010. A cure for this disease can be offered as linked open data, which contains ontologies for health science, diseases, genes, drugs, GEO species, etc. LOD (Linked Open Data) provides the systematic application of information by publishing and connecting structured data on the Web. In the context of this study, we aimed to reduce the boundaries between the semantic web and the geo web. For this reason, a use case is studied with data from the Valencia CSISP (Research Center of Public Health), in which the mortality rates for particular diseases are represented spatio-temporally. The use case data is divided into three conceptual domains (health, spatial, statistical), enhanced with semantic relations and descriptions following Linked Data principles. Finally, in order to convey complex health-related information, we offer an infrastructure integrating the geo web and the semantic web. Based on the established outcome, user access methods are introduced and future research directions are outlined.

    LODNav – An Interactive Visualization of the Linking Open Data Cloud

    The emergence of the Linking Open Data Cloud (LODC) is an example of the adoption of Linked Data principles and the creation of a Web of Data. There is an increasing amount of information linked across member datasets of the LODC by means of RDF links, yet there is little support for a human to understand which datasets are connected to one another. This research presents a novel approach for understanding these interconnections with the publicly accessible tool LODNav (Linking Open Data Navigator). LODNav provides a visualization metaphor of the LODC by positioning its member datasets on a world map according to each dataset's geographical location. This interactive tool aims to provide a dynamic, up-to-date visualization of the LODC and allows the extraction of information about the datasets, as well as their interconnections, as RDF data.

    A systematic literature review of open data quality in practice

    Context: The main objective of open data initiatives is to make information freely available through easily accessible mechanisms and to facilitate its exploitation. In practice, openness should be accompanied by a certain level of trustworthiness, or guarantees about the quality of the data. Traditional data quality is a thoroughly researched field with several benchmarks and frameworks to grasp its dimensions. However, quality assessment in open data is a complicated process, as it involves stakeholders, the evaluation of datasets, and the publishing platform.
    Objective: In this work, we aim to identify and synthesize various features of open data quality approaches in practice. We applied thematic synthesis to identify the most relevant research problems and quality assessment methodologies.
    Method: We undertook a systematic literature review to summarize the state of the art on open data quality. The review process started with the development of the review protocol, in which all steps, research questions, inclusion and exclusion criteria, and analysis procedures are included. The search strategy retrieved 9323 publications from four scientific digital libraries. The selected papers were published between 2005 and 2015. Finally, through a discussion between the authors, 63 papers were included in the final set of selected papers.
    Results: Open data quality, in general, is a broad concept that could apply to multiple areas. There are many quality issues concerning open data that hinder their actual usage in real-world applications. The main ones are unstructured metadata, heterogeneity of data formats, lack of accuracy, incompleteness, and lack of validation techniques. Furthermore, we collected the existing quality methodologies from the selected papers and synthesized them under a unifying classification schema. A list of quality dimensions and metrics from the selected papers is also reported.
    Conclusion: In this research, we provided an overview of the methods related to open data quality, using the instrument of systematic literature reviews. Open data quality methodologies vary depending on the application domain. Moreover, the majority of studies focus on satisfying specific quality criteria. With metrics based on generalized data attributes, a platform could be created to evaluate all possible open datasets. Also, the lack of methodology validation remains a major problem; future studies should focus on validation techniques.
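One of the quality dimensions the review highlights, completeness, can be made concrete with a minimal metric over a tabular dataset. The definition used here (fraction of non-missing cells) is a common one, not necessarily the one used by any particular reviewed study, and the sample dataset is hypothetical.

```python
# Minimal sketch of a completeness metric for an open dataset, defined as
# the fraction of non-missing cells. This is one common definition, not
# necessarily the one adopted by any study in the review.

def completeness(rows):
    """rows: list of dicts mapping column name -> value.
    None or "" counts as a missing cell."""
    total = sum(len(r) for r in rows)
    filled = sum(1 for r in rows for v in r.values() if v not in (None, ""))
    return filled / total if total else 0.0

# Hypothetical sample: 9 cells, 2 missing -> completeness = 7/9
dataset = [
    {"city": "Nairobi", "population": 4397073, "area_km2": 696},
    {"city": "Mombasa", "population": None, "area_km2": 230},
    {"city": "Kisumu", "population": 610082, "area_km2": None},
]
print(round(completeness(dataset), 2))  # 0.78
```

Other dimensions named in the results (accuracy, metadata structure) need reference data or schemas to score, which is part of why a single unifying assessment platform remains difficult.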

    QuerioCity: A Linked Data Platform for Urban Information Management

    In this paper, we present QuerioCity, a platform to catalog, index and query highly heterogeneous information coming from complex systems, such as cities. A series of challenges are identified: namely, the heterogeneity of the domain and the lack of a common model, the volume of information and the number of data sets, the requirement for a low entry threshold to the system, the diversity of the input data in terms of format, syntax and update frequency (streams vs. static data), and the sensitivity of the information. We propose an approach for incremental and continuous integration of static and streaming data, based on Semantic Web technologies. The proposed system is unique in the literature in terms of its handling of multiple integrations of available data sets in combination with flexible provenance tracking, privacy protection and continuous integration of streams. We report on lessons learnt from building the first prototype for Dublin.

    Spatial Search Strategies for Open Government Data: A Systematic Comparison

    The increasing availability of open government datasets on the Web calls for ways to enable their efficient access and searching. There is, however, an overall lack of understanding regarding which spatial search strategies would perform best in this context. To address this gap, this work assessed the impact of different spatial search strategies on performance and user relevance judgment. We harvested machine-readable spatial datasets and their metadata from three English-based open government data portals, performed metadata enhancement, developed a prototype, and performed both a theoretical and a user-based evaluation. The results highlight that (i) switching between area of overlap and Hausdorff distance for spatial similarity computation does not have any substantial impact on performance; and (ii) the use of Hausdorff distance induces slightly better user relevance ratings than the use of area of overlap. The data collected and the insights gleaned may serve as a baseline against which future work can compare. Comment: Paper accepted to GIR'19: 13th Workshop on Geographic Information Retrieval (Lyon, France)
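The two spatial similarity measures compared in the abstract can be illustrated for the simplified case of axis-aligned bounding boxes `(x1, y1, x2, y2)`. This is a sketch under that assumption; real dataset footprints may be arbitrary polygons, and the paper's actual computation may differ.

```python
# Sketch of the two spatial similarity measures compared in the paper,
# simplified to axis-aligned bounding boxes (x1, y1, x2, y2).
# Illustrative only; real dataset footprints may be arbitrary polygons.
import math

def overlap_area(a, b):
    """Area of intersection of two boxes (0 if disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

def point_box_dist(px, py, box):
    """Euclidean distance from a point to the nearest point of a box."""
    dx = max(box[0] - px, 0, px - box[2])
    dy = max(box[1] - py, 0, py - box[3])
    return math.hypot(dx, dy)

def hausdorff(a, b):
    """Hausdorff distance between two boxes. For convex sets the directed
    distance is attained at a vertex, so checking corners suffices."""
    def corners(box):
        x1, y1, x2, y2 = box
        return [(x1, y1), (x1, y2), (x2, y1), (x2, y2)]
    h_ab = max(point_box_dist(x, y, b) for x, y in corners(a))
    h_ba = max(point_box_dist(x, y, a) for x, y in corners(b))
    return max(h_ab, h_ba)

query = (0, 0, 4, 4)     # hypothetical query footprint
dataset = (2, 2, 6, 6)   # hypothetical dataset footprint
print(overlap_area(query, dataset))      # 4
print(round(hausdorff(query, dataset), 2))  # 2.83
```

Note the two measures rank differently: area of overlap rewards shared extent, while Hausdorff distance penalizes the farthest mismatch, which is one plausible reason users may judge results from the two strategies differently.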

    Distribution and Process of Environmental Inequity in the Brazos Valley, Texas

    Lower-income and minority communities have long borne an unequal burden of toxic pollution from environmental hazards. I examined environmental inequity, the unequal distribution of environmental hazards in minority and economically disadvantaged communities and the exclusion of community members from environmental decision making, in the Brazos Valley, Texas. This project offers a broad review of unequal environmental burdens and the marginalization of minority communities as background to better understand problems in Central Texas. Geographic Information System (GIS) analyses were used to examine the distribution of potential environmental exposures in the Brazos Valley, while qualitative methods assessed the role of a case study community (Bryan, Texas) in the environmental decision-making processes related to these risks.