
    Shape Expressions Schemas

    We present Shape Expressions (ShEx), an expressive schema language for RDF designed to provide a high-level, user-friendly syntax with intuitive semantics. ShEx makes it possible to describe the vocabulary and the structure of an RDF graph, and to constrain the allowed values for the properties of a node. It includes an algebraic grouping operator, a choice operator, cardinality constraints for the number of allowed occurrences of a property, and negation. We define the semantics of the language and illustrate it with examples. We then present a validation algorithm that, given a node in an RDF graph and a constraint defined by the ShEx schema, checks whether the node satisfies that constraint. The algorithm outputs a proof that contains trivially verifiable associations of nodes and the constraints that they satisfy. This structure can be used for complex post-processing tasks, such as transforming the RDF graph to other graph or tree structures, verifying more complex constraints, or debugging (w.r.t. the schema). We also show the inherent difficulty of error identification in ShEx.
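The cardinality checking described in the abstract can be sketched in a few lines of Python. This is an illustrative simplification, not the paper's algorithm: a shape is reduced to per-property cardinality intervals, and the `ex:` properties and the `SHAPE_PERSON` shape are invented for the example. The returned proof mirrors the idea of trivially verifiable node/constraint associations.

```python
from collections import Counter

# Hypothetical, simplified model of a ShEx-style shape: each property
# is mapped to an allowed cardinality interval (min, max).
SHAPE_PERSON = {
    "ex:name":  (1, 1),     # exactly one name
    "ex:email": (1, None),  # one or more emails (None = unbounded)
    "ex:phone": (0, 1),     # at most one phone
}

def satisfies(triples, node, shape):
    """Check whether node's outgoing properties fit the shape's
    cardinality intervals, and collect the checked associations."""
    counts = Counter(p for s, p, o in triples if s == node)
    proof = []
    for prop, (lo, hi) in shape.items():
        n = counts[prop]
        ok = n >= lo and (hi is None or n <= hi)
        proof.append((node, prop, n, ok))
        if not ok:
            return False, proof
    # closed-shape reading: no properties outside the shape allowed
    extra = set(counts) - set(shape)
    return not extra, proof

triples = [
    ("ex:alice", "ex:name",  "Alice"),
    ("ex:alice", "ex:email", "a@x.org"),
    ("ex:alice", "ex:email", "a@y.org"),
]
ok, proof = satisfies(triples, "ex:alice", SHAPE_PERSON)
```

Real ShEx additionally supports grouping, choice, negation, and value constraints on objects; this sketch covers only the cardinality fragment.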

    Complexity and Expressiveness of ShEx for RDF

    We study the expressiveness and complexity of Shape Expression Schemas (ShEx), a novel schema formalism for RDF currently under development by the W3C. ShEx assigns types to the nodes of an RDF graph and constrains the admissible neighborhoods of nodes of a given type with regular bag expressions (RBEs). We formalize and investigate two alternative semantics, multi-type and single-type, depending on whether or not a node may have more than one type. We study the expressive power of ShEx and the complexity of the validation problem. We show that the single-type semantics is strictly more expressive than the multi-type semantics, that single-type validation is generally intractable, and that multi-type validation is feasible for a small (yet practical) subclass of RBEs. To curb the high computational complexity of validation, we propose a natural notion of determinism and show that multi-type validation for the class of deterministic schemas using single-occurrence regular bag expressions (SORBEs) is tractable.
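The multi-type semantics can be illustrated with a small greatest-fixpoint sketch: start by assigning every type to every node, then repeatedly drop any type whose neighborhood constraint the node violates, until the typing is stable. This is a hedged simplification, not the paper's algorithm: constraints here are closed interval bounds on edge labels only (the types of neighbors are ignored for brevity), and the graph and constraints are invented.

```python
from collections import Counter

# node -> list of (edge label, neighbor) pairs (invented example graph)
edges = {
    "doc": [("author", "p1"), ("author", "p2")],
    "p1":  [],
    "p2":  [],
}

# Hypothetical interval constraints: type -> {label: (min, max)}
constraints = {
    "Document": {"author": (1, 2)},
    "Person":   {},  # no outgoing edges required (or allowed, closed)
}

def refine(edges, constraints):
    """Greatest-fixpoint refinement: every node starts with every
    type; drop violated types until nothing changes."""
    typing = {node: set(constraints) for node in edges}
    changed = True
    while changed:
        changed = False
        for node, types in typing.items():
            counts = Counter(lbl for lbl, _ in edges[node])
            for t in list(types):
                need = constraints[t]
                bad = any(not (lo <= counts[lbl] <= hi)
                          for lbl, (lo, hi) in need.items())
                # closed reading: labels outside the constraint forbidden
                bad = bad or any(lbl not in need for lbl in counts)
                if bad:
                    types.discard(t)
                    changed = True
    return typing

typing = refine(edges, constraints)
```

Under the multi-type semantics a node may keep several types after refinement; the single-type semantics would instead demand a choice of exactly one type per node, which is the source of the intractability discussed above.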

    Linked open drug data for pharmaceutical research and development

    There is an abundance of information about drugs available on the Web. Data sources range from medicinal chemistry results, through the impact of drugs on gene expression, to the outcomes of drugs in clinical trials. These data are typically not connected together, which reduces the ease with which insights can be gained. Linking Open Drug Data (LODD) is a task force within the World Wide Web Consortium's (W3C) Health Care and Life Sciences Interest Group (HCLS IG). LODD has surveyed publicly available data about drugs, created Linked Data representations of the data sets, and identified interesting scientific and business questions that can be answered once the data sets are connected. The task force provides recommendations on best practices for exposing data in a Linked Data representation. In this paper, we present past and ongoing work of LODD and discuss the growing importance of Linked Data as a foundation for pharmaceutical R&D data sharing.

    FHIR RDF Data Transformation and Validation Framework and Clinical Knowledge Graphs: Towards Explainable AI in Healthcare

    HL7 Fast Healthcare Interoperability Resources (FHIR) is rapidly becoming the standards framework for the exchange of electronic health record (EHR) data. By leveraging FHIR's resource-oriented architecture, FHIR RDF stands to become the first mainstream clinical data standard to incorporate the Semantic Web vision. The combination of FHIR, knowledge graphs, and the Semantic Web enables a new paradigm for building classification and explainable artificial intelligence (AI) applications in healthcare. The objective of the tutorial is to introduce the FHIR RDF data transformation and validation framework, show how to build clinical knowledge graphs (cKG) in FHIR RDF, and provide the audience with hands-on opportunities with FHIR RDF and cKG tooling. Specifically, topics regarding the FHIR RDF data transformation and validation framework include: 1. FHIR and its representations, FHIR JSON and FHIR RDF; 2. conversion of FHIR JSON to FHIR RDF (via JSON-LD), use of the FHIR RDF playground, command-line tools, and HAPI-FHIR's RDF (Turtle) support; 3. the Shape Expressions (ShEx) schema for FHIR and its use for validating FHIR data; 4. FHIR structure definitions and their expression as JSON-LD contexts and ShEx schemas. In addition, the FHIR-Ontop-OMOP tool exposes Observational Medical Outcomes Partnership (OMOP) data as a queryable knowledge graph compliant with the HL7 FHIR standard, using the Ontop Virtual Knowledge Graph engine. In this tutorial, we demonstrate how to set up Ontop over a working connection to an OMOP PostgreSQL database using a mapping language. Thanks to the virtual approach, the FHIR RDF triples populated by declarative mappings do not need to be materialized. Instead, Ontop translates SPARQL queries over the FHIR RDF model into SQL queries over the OMOP database. We will illustrate the query translation process with some representative phenotype queries.
Acknowledgements: This tutorial session is supported in part by the NIH FHIRCat R01 grant (R01 EB030529)
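The FHIR JSON to FHIR RDF conversion mentioned in topic 2 is performed via JSON-LD and the official tooling; purely as intuition, a minimal sketch might flatten a FHIR-style JSON resource into subject/predicate/object triples. The `fhir:` property names and example.org subject IRIs below are assumptions for illustration only and do not follow the official FHIR RDF mapping.

```python
import json

# A tiny FHIR-style JSON resource (fields abbreviated for the example).
patient_json = """{
  "resourceType": "Patient",
  "id": "pat1",
  "gender": "female",
  "birthDate": "1970-01-01"
}"""

def to_triples(resource):
    """Flatten a flat JSON resource into (s, p, o) triples under a
    hypothetical fhir: prefix; nested structures are ignored here."""
    subject = "<http://example.org/{}/{}>".format(
        resource["resourceType"], resource["id"])
    rtype = resource["resourceType"]
    triples = [(subject, "rdf:type", "fhir:" + rtype)]
    for key, value in resource.items():
        if key in ("resourceType", "id"):
            continue
        # json.dumps quotes the value as an RDF-style string literal
        triples.append((subject, "fhir:{}.{}".format(rtype, key),
                        json.dumps(value)))
    return triples

triples = to_triples(json.loads(patient_json))
turtle = "\n".join("{} {} {} .".format(s, p, o) for s, p, o in triples)
```

The real conversion handles nesting, references, datatypes, and ordered lists through the FHIR JSON-LD contexts; this sketch only conveys the flattening idea.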

    Validating RDF Data

    RDF and Linked Data have broad applicability across many fields, from aircraft manufacturing to zoology. Requirements for detecting bad data differ across communities, fields, and tasks, but nearly all involve some form of data validation. This book introduces data validation and describes its practical use in day-to-day data exchange. The Semantic Web offers a bold, new take on how to organize, distribute, index, and share data. Using Web addresses (URIs) as identifiers for data elements enables the construction of distributed databases on a global scale. Like the Web, the Semantic Web is heralded as an information revolution, and also like the Web, it is encumbered by data quality issues. The quality of Semantic Web data is compromised by the lack of resources for data curation, for maintenance, and for developing globally applicable data models. At the enterprise scale, these problems have conventional solutions. Master data management provides an enterprise-wide vocabulary, while constraint languages capture and enforce data structures. Filling a need long recognized by Semantic Web users, shapes languages provide models and vocabularies for expressing such structural constraints. This book describes two technologies for RDF validation: Shape Expressions (ShEx) and Shapes Constraint Language (SHACL), the rationales for their designs, a comparison of the two, and some example applications. Table of Contents: Preface / Foreword by Phil Archer / Foreword by Tom Baker / Foreword by Dan Brickley and Libby Miller / Acknowledgments / Introduction / The RDF Ecosystem / Data Quality / Shape Expressions / SHACL / Applications / Comparing ShEx and SHACL / Bibliography / Authors' Biographies / Index

    Creating, maintaining and updating Shape Expressions as EntitySchemas in the Wikimedia ecosystem

    Shape Expressions are formal, machine-readable descriptions of data shapes/schemas. They provide the means to validate the expectations of both data providers and use-case providers. In 2019, Wikidata introduced the EntitySchema namespace, which allows storing Shape Expressions in Wikidata and in Wikibase extensions. Next to Wikidata, this EntitySchema namespace is also available to local Wikibase installs and on cloud installations such as wbstack.com. In this tutorial we will briefly introduce Shape Expressions, after which we will guide the audience through the EntitySchema namespace in both Wikidata and Wikibase. We will also introduce Wikishape (https://wikishape.weso.es/) as a Shape Expressions platform provided by WESO. After this tutorial the participants will be able to write simple Shape Expressions and maintain them on either Wikidata or a local Wikibase.

    Frugal FAIR Data Point: A Document-Based Implementation for Enhanced Interoperability and Accessibility

    This paper presents a novel approach to publishing metadata in a FAIR manner, according to the FAIR Data Point (FDP) specifications. The approach focuses on a document-based methodology that aligns with stringent infrastructure policies and caters to institutions with limited IT resources. While adhering to the standard FDP specifications, this implementation offers a lightweight solution that serves metadata directly from the filesystem through a lean web server, ensuring broad accessibility and interoperability.
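The "filesystem plus lean web server" idea can be sketched with Python's standard library alone. This is an assumption-laden illustration, not the paper's implementation: the `.ttl`/Turtle media-type mapping and the directory layout are invented for the example, and a production FDP would also need content negotiation and the prescribed metadata hierarchy.

```python
import functools
import http.server

class MetadataHandler(http.server.SimpleHTTPRequestHandler):
    """Serve pre-generated metadata documents straight from disk,
    giving .ttl files the RDF Turtle media type."""
    # Overrides take precedence over the mimetypes fallback.
    extensions_map = {".ttl": "text/turtle"}

def serve(directory, port=8080):
    """Publish the given metadata directory; blocks until interrupted."""
    handler = functools.partial(MetadataHandler, directory=directory)
    with http.server.ThreadingHTTPServer(("", port), handler) as srv:
        srv.serve_forever()
```

Because the metadata documents are static files, they can be generated offline and versioned like any other artifact, which is precisely what makes this approach viable under strict infrastructure policies.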