9,944 research outputs found

    Representing Dataset Quality Metadata using Multi-Dimensional Views

    Full text link
    Data quality is commonly defined as fitness for use. The problem of identifying quality of data is faced by many data consumers. Data publishers often do not have the means to identify quality problems in their data. To make the task for both stakeholders easier, we have developed the Dataset Quality Ontology (daQ). daQ is a core vocabulary for representing the results of quality benchmarking of a linked dataset. It represents quality metadata as multi-dimensional and statistical observations using the Data Cube vocabulary. Quality metadata are organised as a self-contained graph, which can, e.g., be embedded into linked open datasets. We discuss the design considerations, give examples for extending daQ by custom quality metrics, and present use cases such as analysing data versions, browsing datasets by quality, and link identification. We finally discuss how data cube visualisation tools enable data publishers and consumers to analyse better the quality of their data.Comment: Preprint of a paper submitted to the forthcoming SEMANTiCS 2014, 4-5 September 2014, Leipzig, German

    Continuous Improvement Through Knowledge-Guided Analysis in Experience Feedback

    Get PDF
    Continuous improvement in industrial processes is increasingly a key element of competitiveness for industrial systems. The management of experience feedback in this framework is designed to build, analyze and facilitate the knowledge sharing among problem solving practitioners of an organization in order to improve processes and products achievement. During Problem Solving Processes, the intellectual investment of experts is often considerable and the opportunities for expert knowledge exploitation are numerous: decision making, problem solving under uncertainty, and expert configuration. In this paper, our contribution relates to the structuring of a cognitive experience feedback framework, which allows a flexible exploitation of expert knowledge during Problem Solving Processes and a reuse such collected experience. To that purpose, the proposed approach uses the general principles of root cause analysis for identifying the root causes of problems or events, the conceptual graphs formalism for the semantic conceptualization of the domain vocabulary and the Transferable Belief Model for the fusion of information from different sources. The underlying formal reasoning mechanisms (logic-based semantics) in conceptual graphs enable intelligent information retrieval for the effective exploitation of lessons learned from past projects. An example will illustrate the application of the proposed approach of experience feedback processes formalization in the transport industry sector

    Finding a Repository with the Help of Machine-Actionable DMPs: Opportunities and Challenges

    Get PDF
    Finding a suitable repository to deposit research data is a difficult task for researchers since the landscape consists of thousands of repositories and automated tool support is limited. Machine-actionable DMPs can improve the situation since they contain relevant context information in a structured and machine-friendly way and therefore enable automated support in repository recommendation. This work describes the current practice of repository selection and the available support today. We outline the opportunities and challenges of using machine-actionable DMPs to improve repository recommendation. By linking the use case of repository recommendation to the ten principles for machine-actionable DMPs, we show how this vision can be realized. A filterable and searchable repository registry that provides rich metadata for each indexed repository record is a key element in the architecture described. At the example of repository registries we show that by mapping machine-actionable DMP content and data policy elements to their filter criteria and querying their APIs a ranked list of repositories can be suggested.  [This paper is a conference pre-print presented at IDCC 2020 after lightweight peer review.

    Chemical information matters: an e-Research perspective on information and data sharing in the chemical sciences

    No full text
    Recently, a number of organisations have called for open access to scientific information and especially to the data obtained from publicly funded research, among which the Royal Society report and the European Commission press release are particularly notable. It has long been accepted that building research on the foundations laid by other scientists is both effective and efficient. Regrettably, some disciplines, chemistry being one, have been slow to recognise the value of sharing and have thus been reluctant to curate their data and information in preparation for exchanging it. The very significant increases in both the volume and the complexity of the datasets produced has encouraged the expansion of e-Research, and stimulated the development of methodologies for managing, organising, and analysing "big data". We review the evolution of cheminformatics, the amalgam of chemistry, computer science, and information technology, and assess the wider e-Science and e-Research perspective. Chemical information does matter, as do matters of communicating data and collaborating with data. For chemistry, unique identifiers, structure representations, and property descriptors are essential to the activities of sharing and exchange. Open science entails the sharing of more than mere facts: for example, the publication of negative outcomes can facilitate better understanding of which synthetic routes to choose, an aspiration of the Dial-a-Molecule Grand Challenge. The protagonists of open notebook science go even further and exchange their thoughts and plans. We consider the concepts of preservation, curation, provenance, discovery, and access in the context of the research lifecycle, and then focus on the role of metadata, particularly the ontologies on which the emerging chemical Semantic Web will depend. Among our conclusions, we present our choice of the "grand challenges" for the preservation and sharing of chemical information

    An ontology to standardize research output of nutritional epidemiology : from paper-based standards to linked content

    Get PDF
    Background: The use of linked data in the Semantic Web is a promising approach to add value to nutrition research. An ontology, which defines the logical relationships between well-defined taxonomic terms, enables linking and harmonizing research output. To enable the description of domain-specific output in nutritional epidemiology, we propose the Ontology for Nutritional Epidemiology (ONE) according to authoritative guidance for nutritional epidemiology. Methods: Firstly, a scoping review was conducted to identify existing ontology terms for reuse in ONE. Secondly, existing data standards and reporting guidelines for nutritional epidemiology were converted into an ontology. The terms used in the standards were summarized and listed separately in a taxonomic hierarchy. Thirdly, the ontologies of the nutritional epidemiologic standards, reporting guidelines, and the core concepts were gathered in ONE. Three case studies were included to illustrate potential applications: (i) annotation of existing manuscripts and data, (ii) ontology-based inference, and (iii) estimation of reporting completeness in a sample of nine manuscripts. Results: Ontologies for food and nutrition (n = 37), disease and specific population (n = 100), data description (n = 21), research description (n = 35), and supplementary (meta) data description (n = 44) were reviewed and listed. ONE consists of 339 classes: 79 new classes to describe data and 24 new classes to describe the content of manuscripts. Conclusion: ONE is a resource to automate data integration, searching, and browsing, and can be used to assess reporting completeness in nutritional epidemiology

    Fostering the Adoption of DMP in Small Research Projects through a Collaborative Approach

    Get PDF
    In order to promote sound management of research data the European Commission, under the Horizon 2020 framework program, is promoting the adoption of a Data Management Plan (DMP) in research projects. Despite the value of a DMP to make data findable, accessible, interoperable and reusable (FAIR) through time, the development and implementation of DMPs is not yet a common practice in health research. Raising the awareness of researchers in small projects to the benefits of early adoption of a DMP is, therefore, a motivator for others to follow suit. In this paper we describe an approach to engage researchers in the writing of a DMP, in an ongoing project, FrailSurvey, in which researchers are collecting data through a mobile application for self-assessment of fragility. The case study is supported by interviews, a metadata creation session, as well as the validation of recommendations by researchers. With the outline of our process we also outline tools and services that supported the development of the DMP in this small project, particularly since there were no institutional services available to researcher
    corecore