Open Data Quality
The research discusses how (open) data quality can be described, what should be considered when developing a data quality management solution, and how such a solution can be applied to open data to check its quality. The proposed approach focuses on the development of a data quality specification that can be executed to obtain data quality evaluation results and to find errors in the data and potential problems that must be solved. The proposed approach is applied to several open data sets to evaluate their quality. Open data is very popular and freely available to every stakeholder, and it is often used to make business decisions. It is important to be sure that this data is trustworthy and error-free, as its quality problems can lead to huge losses.
Comment: 10 pages, 3 figures, 13th International Baltic Conference on Databases and Information Systems & The Baltic DB&IS 2018 Doctoral Consortium (Baltic DB&IS 2018), Trakai, Lithuania, Volume 2158. arXiv admin note: substantial text overlap with arXiv:2007.0469
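A minimal sketch of what an executable data quality specification could look like, assuming a rule-per-field design in Python; the paper does not publish its specification language, so the rule names, the input file, and the fields below are hypothetical:

    import csv

    # Hypothetical "data quality specification": named rules, each a
    # predicate over a single record of the dataset.
    SPEC = {
        "id_present":   lambda row: bool(row.get("id", "").strip()),
        "year_is_int":  lambda row: row.get("year", "").isdigit(),
        "email_has_at": lambda row: "@" in row.get("email", ""),
    }

    def evaluate(path):
        """Execute the specification against a CSV file; report violation rates."""
        violations = {name: 0 for name in SPEC}
        total = 0
        with open(path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                total += 1
                for name, rule in SPEC.items():
                    if not rule(row):
                        violations[name] += 1
        return {name: n / total for name, n in violations.items()} if total else {}

    print(evaluate("open_dataset.csv"))  # hypothetical file name

Executing the specification yields a per-rule violation rate, which is the kind of evaluation result the abstract refers to: it quantifies quality and shows which checks fail and how often.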
Panel data and open-ended questions: Understanding perceptions of quality of life
This paper describes the burgeoning interest in quality of life studies and suggests that, as well as expert definitions, we need to consider people's own perceptions of what matters. Using open-ended questions from the 1997 and 2002 waves of the British Household Panel Survey, we analyse both quantitatively and qualitatively how perceptions of quality of life differ for men and women across the life course. Qualitative analysis reveals that key domains such as health, family and finances often refer not to self but to others. Longitudinal analysis demonstrates that people's perceptions of quality of life change over time, particularly before and after important life transitions. Thus our findings challenge overly individualistic and static conceptions of quality of life and reveal quality of life as a process, not a fixed state.
Quality Assessment of Linked Datasets using Probabilistic Approximation
With the increasing application of Linked Open Data, assessing the quality of datasets by computing quality metrics becomes an issue of crucial importance. For large and evolving datasets, an exact, deterministic computation of the quality metrics is too time-consuming or expensive. We employ probabilistic techniques such as Reservoir Sampling, Bloom Filters and Clustering Coefficient estimation to implement a broad set of data quality metrics in an approximate but sufficiently accurate way. Our implementation is integrated in the comprehensive data quality assessment framework Luzzu. We evaluated its performance and accuracy on Linked Open Datasets of broad relevance.
Comment: 15 pages, 2 figures, to appear in ESWC 2015 proceedings
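As a rough illustration of the probabilistic idea, here is a reservoir sampling sketch in Python that estimates a quality metric from a fixed-size uniform sample of a large triple stream instead of an exact pass; this is not Luzzu's actual API, and the metric and toy data are invented:

    import random

    def reservoir_sample(stream, k):
        """Keep a uniform random sample of k items from a stream of unknown length."""
        reservoir = []
        for i, item in enumerate(stream):
            if i < k:
                reservoir.append(item)
            else:
                j = random.randint(0, i)  # classic Algorithm R replacement step
                if j < k:
                    reservoir[j] = item
        return reservoir

    def estimated_typed_literal_ratio(triples, k=1000):
        """Approximate the share of triples whose object is a typed literal."""
        sample = reservoir_sample(triples, k)
        return sum(1 for _, _, o in sample if "^^" in o) / len(sample)

    # Toy stream: 90% typed-literal objects, 10% IRI objects.
    triples = ((f"<s{i}>", "<p>", f'"{i}"^^<xsd:integer>' if i % 10 else "<o>")
               for i in range(100_000))
    print(estimated_typed_literal_ratio(triples))  # ~0.9

The accuracy/cost trade-off is governed by k: memory stays constant regardless of dataset size, which is the property that makes such techniques attractive for large, evolving datasets.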
Open Data Quality Measurement Framework: Definition and Application to Open Government Data
The diffusion of Open Government Data (OGD) in recent years has kept a very fast pace. However, evidence from practitioners shows that disclosing data without proper quality control may jeopardize dataset reuse and negatively affect civic participation. Current approaches to the problem in the literature lack a comprehensive theoretical framework. Moreover, most evaluations concentrate on open data platforms rather than on datasets.
In this work, we address these two limitations and set up a framework of indicators to measure the quality of Open Government Data along a series of data quality dimensions at the most granular level of measurement. We validated the evaluation framework by applying it to compare two cases of Italian OGD datasets: an internationally recognized good example of OGD, with centralized disclosure and extensive data quality controls, and samples of OGD from decentralized data disclosure (at the municipal level), without the extensive quality controls possible in the former case and hence with presumably lower quality.
Starting from measurements based on the quality framework, we were able to verify the difference in quality: the measures showed a few common acquired good practices and weaknesses, and a set of discriminating factors that pertain to the type of datasets and the overall approach. On the basis of this evaluation, we also provide technical and policy guidelines to overcome the weaknesses observed in the decentralized release policy, addressing specific quality aspects.
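A hedged sketch of one such indicator, cell-level completeness aggregated per column and per dataset; the paper's actual indicator definitions and dimension weights are not reproduced here, and the file names are invented:

    import csv

    def completeness(path):
        """Share of non-empty cells per column, plus an unweighted dataset score."""
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            filled = {col: 0 for col in reader.fieldnames}
            total = 0
            for row in reader:
                total += 1
                for col in filled:
                    if row[col] and row[col].strip():
                        filled[col] += 1
        per_column = {col: n / total for col, n in filled.items()} if total else {}
        score = sum(per_column.values()) / len(per_column) if per_column else 0.0
        return per_column, score

    # Comparing two releases, as in the paper's centralized-vs-decentralized
    # case study (hypothetical file names):
    # completeness("central_portal.csv") vs completeness("municipal_portal.csv")

Measuring at the cell level rather than at the portal level is what lets such a framework discriminate between two publishers of nominally the same data.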
Open Data Quality Evaluation: A Comparative Analysis of Open Data in Latvia
Nowadays open data is entering the mainstream: it is freely available to every stakeholder and is often used in business decision-making. It is important to be sure the data is trustworthy and error-free, as its quality problems can lead to huge losses. The research discusses how (open) data quality can be assessed. It also covers the main points that should be considered when developing a data quality management solution. One specific approach is applied to several Latvian open data sets. The research provides a step-by-step open data set analysis guide and summarizes its results. It is also shown that data quality can differ depending on the data supplier (centralized versus decentralized data releases) and that, unfortunately, a trusted data supplier cannot guarantee the absence of data quality problems. The research also highlights common data quality problems detected not only in Latvian open data but also in the open data of three European countries.
Comment: 24 pages, 2 tables, 3 figures, Baltic J. Modern Computing
Challenges of Open Data Quality: More Than Just License, Format, and Customer Support
The research described here was supported by the award made by the RCUK Digital Economy programme to the dot.rural Digital Economy Hub, award reference: EP/G066051/1; and by the Innovate UK award reference: 102615. Peer reviewed. Postprint.
Preliminary results on Ontology-based Open Data Publishing
Despite the current interest in Open Data publishing, a formal and comprehensive methodology that supports an organization in deciding which data to publish and in carrying out precise procedures for publishing high-quality data is still missing. In this paper we argue that the Ontology-based Data Management paradigm can provide a formal basis for a principled approach to publishing high-quality, semantically annotated Open Data. We describe two main approaches to using an ontology for this endeavor, and then we present some technical results on one of the approaches, called bottom-up, in which the specification of the data to be published is given in terms of the sources, and specific techniques allow deriving suitable annotations for interpreting the published data in the light of the ontology.
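A toy sketch of the bottom-up direction as described: the data to publish is specified in terms of the source, and a source-to-ontology mapping derives the semantic annotations for the published records. The ontology IRIs and the source schema here are invented for illustration, not taken from the paper:

    # Source-level specification: rows as they exist in the source system.
    SOURCE_ROWS = [
        {"emp_id": "17", "emp_name": "Ada", "dept": "R&D"},
    ]

    # Mapping from source columns to ontology properties: the annotations
    # that let consumers interpret the published data via the ontology.
    MAPPING = {
        "emp_id":   "http://example.org/onto#employeeId",
        "emp_name": "http://example.org/onto#name",
        "dept":     "http://example.org/onto#memberOfDepartment",
    }
    SUBJECT_TEMPLATE = "http://example.org/data/employee/{emp_id}"

    def publish(rows):
        """Yield (subject, property, value) triples derived from the mapping."""
        for row in rows:
            subject = SUBJECT_TEMPLATE.format(**row)
            for column, prop in MAPPING.items():
                yield (subject, prop, row[column])

    for triple in publish(SOURCE_ROWS):
        print(triple)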
Quality of metadata in open data portals
During the last decade, numerous governmental, educational and cultural institutions have launched Open Data initiatives that have facilitated access to large volumes of datasets on the web. The main way to disseminate this availability of data has been the deployment of Open Data catalogs exposing metadata of these datasets, which are easily indexed by web search engines. Open Source platforms have greatly eased the work of institutions involved in Open Data initiatives, making the setup of Open Data portals an almost trivial task. However, few approaches have analyzed how precisely metadata describes the associated datasets. Taking into account the existing approaches for analyzing the quality of metadata in the Open Data context and other related domains, this work contributes to the state of the art by extending an ISO 19157-based method for checking the quality of geographic metadata to the context of Open Data metadata. Focusing on metadata models compliant with the Data Catalog Vocabulary (DCAT) proposed by the W3C, the extended method has been applied to evaluate the Open Data catalog of the Spanish Government. The results have also been compared with those obtained by the Metadata Quality Assessment methodology proposed at the European Data Portal.
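As a rough sketch of the kind of check involved, the snippet below scores the completeness of DCAT-style catalog records in Python; the property names are real DCAT/Dublin Core terms, but the flat-dict record structure and the unweighted scoring are simplifications, not the ISO 19157-based method the paper extends:

    # Core properties a DCAT-compliant record is expected to carry.
    DCAT_FIELDS = ["dct:title", "dct:description", "dct:license",
                   "dct:issued", "dcat:keyword", "dcat:distribution"]

    def metadata_completeness(record):
        """Share of core DCAT properties present and non-empty in one record."""
        return sum(1 for f in DCAT_FIELDS if record.get(f)) / len(DCAT_FIELDS)

    catalog = [  # toy catalog entries
        {"dct:title": "Air quality 2023", "dct:license": "CC-BY-4.0",
         "dcat:distribution": ["https://example.org/air.csv"]},
        {"dct:title": "Budget 2024", "dct:description": "Annual budget",
         "dct:issued": "2024-01-01", "dcat:keyword": ["finance"],
         "dct:license": "CC0-1.0",
         "dcat:distribution": ["https://example.org/budget.csv"]},
    ]
    scores = [metadata_completeness(r) for r in catalog]
    print(scores, sum(scores) / len(scores))  # per-record scores, catalog average

Scoring each record rather than the portal as a whole mirrors the granularity argument made across several of the works listed above.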