
    Semantics and Validation of Shapes Schemas for RDF

    We present a formal semantics and proof of soundness for shapes schemas, an expressive schema language for RDF graphs that is the foundation of Shape Expressions Language 2.0. It can be used to describe the vocabulary and the structure of an RDF graph, and to constrain the admissible properties and values for nodes in that graph. The language defines a typing mechanism called shapes against which nodes of the graph can be checked. It includes an algebraic grouping operator, a choice operator and cardinality constraints for the number of allowed occurrences of a property. Shapes can be combined using Boolean operators, and can use possibly recursive references to other shapes. We describe the syntax of the language and define its semantics. The semantics is proven to be well-defined for schemas that satisfy a reasonable syntactic restriction, namely stratified use of negation and recursion. We present two algorithms for the validation of an RDF graph against a shapes schema. The first algorithm is a direct implementation of the semantics, whereas the second is a non-trivial improvement. We also briefly give implementation guidelines

    Verification and Validation of Semantic Annotations

    In this paper, we propose a framework to perform verification and validation of semantically annotated data. The annotations, extracted from websites, are verified against the schema.org vocabulary and Domain Specifications to ensure the syntactic correctness and completeness of the annotations. The Domain Specifications allow checking the compliance of annotations against corresponding domain-specific constraints. The validation mechanism will detect errors and inconsistencies between the content of the analyzed schema.org annotations and the content of the web pages where the annotations were found. Comment: Accepted for the A.P. Ershov Informatics Conference 2019 (the PSI Conference Series, 12th edition) proceedin

    Relational to RDF Data Exchange in Presence of a Shape Expression Schema

    We study the relational to RDF data exchange problem, where the target constraints are specified using a Shape Expression schema (ShEx). We investigate two fundamental problems: 1) consistency, which is checking for a given data exchange setting whether there always exists a solution for any source instance, and 2) constructing a universal solution, which is a solution that represents the space of all solutions. We propose to use typed IRI constructors in source-to-target tuple generating dependencies to create the IRIs of the RDF graph from the values in the relational instance, and we translate ShEx into a set of target dependencies. We also identify data exchange settings that are key covered, a property that is decidable and guarantees consistency. Furthermore, we show that this property is a sufficient and necessary condition for the existence of universal solutions for a practical subclass of weakly-recursive ShEx

    Towards a Toolbox for Automated Assessment of Machine-Actionable Data Management Plans

    Most research funders require Data Management Plans (DMPs). The review process can be time consuming, since reviewers read text documents submitted by researchers and provide their feedback. Moreover, it requires specific expert knowledge in data stewardship, which is scarce. Machine-actionable Data Management Plans (maDMPs) and semantic technologies increase the potential for automatic assessment of information contained in DMPs. However, the level of automation and new possibilities are still not well-explored and leveraged. This paper discusses methods for the automation of DMP assessment. It goes beyond generating human-readable reports. It explores how the information contained in maDMPs can be used to provide automated pre-assessment or to fetch further information, allowing reviewers to better judge the content. We map the identified methods to various reviewer goals

    Linked data approach in accessing geospatial big data

    Today, linked data is frequently associated with Geographic Information Systems (GIS), as its technology stack is used to alleviate geospatial data integration issues. Geospatial data have become ubiquitous, and these data can be georeferenced. One type of georeferenced data that is lacking in Malaysia is Malaysian oceanographic data. Fortunately, most earth observation agencies have granted access to their data obtained from satellite altimetry. However, the exponential growth of geospatial data, together with its complexity and diversity, has led to a big data problem and has made information sharing and exchange on the web more complicated. To resolve this issue, linked data should be used in handling geospatial big data. Linked data is one of the best practices for exposing, sharing, publishing and connecting structured data on the web. This study explored linked data as an approach to provide access to the Malaysian physical oceanography datasets on the web, which would allow the data to be standardized in a machine-readable format. The research reviewed the existing software tools used in publishing linked data, identified an appropriate software tool to generate Resource Description Framework (RDF) representations of geographical data, and built a physical oceanography data website based on linked data principles. Initially, document analysis was conducted to review the existing linked data tools that have been used for geospatial data. Various scholarly articles, journals, tutorials and web pages were used as references to investigate the use of linked data tools. Based on the review, five software tools, namely Geometry2RDF, TripleGeo, Datalift, OpenLink Virtuoso and KARMA, were identified as appropriate tools to generate the RDF. Each of these software tools has its own capabilities and functionalities. Next, the tools were compared with one another, based on the literature review, to find the tool best able to manage georeferenced oceanographic data. After the comparison, this study identified Datalift as the best software tool for transforming the shapefile into the RDF format. Finally, a web-based information system was built to publish the linked data for interlinking and sharing by web users. In conclusion, this study has introduced an alternative way to publish and access geospatial data, particularly physical oceanography datasets, using linked data principles. Such an approach would assist stakeholders and unveil information within the big data, thus enriching the discovery of geospatial information on the web