Semantics and Validation of Shapes Schemas for RDF
We present a formal semantics and proof of soundness for shapes schemas, an
expressive schema language for RDF graphs that is the foundation of Shape
Expressions Language 2.0. It can be used to describe the vocabulary and the
structure of an RDF graph, and to constrain the admissible properties and
values for nodes in that graph. The language defines a typing mechanism called
shapes against which nodes of the graph can be checked. It includes an
algebraic grouping operator, a choice operator and cardinality constraints for
the number of allowed occurrences of a property. Shapes can be combined using
Boolean operators, and can use possibly recursive references to other shapes.
We describe the syntax of the language and define its semantics. The
semantics is proven to be well-defined for schemas that satisfy a reasonable
syntactic restriction, namely stratified use of negation and recursion. We
present two algorithms for the validation of an RDF graph against a shapes
schema. The first algorithm is a direct implementation of the semantics,
whereas the second is a non-trivial improvement. We also briefly give
implementation guidelines.
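The cardinality constraints mentioned in this abstract can be illustrated with a minimal Python sketch. It is not either of the paper's two algorithms: it ignores grouping, choice, Boolean operators, negation and recursive shape references, and it only checks, for one node, how many triples use each constrained property. The names `TripleConstraint`, `conforms`, `person_shape` and the `ex:`/`foaf:` sample data are hypothetical, introduced for illustration only.

```python
from typing import Callable, List, NamedTuple, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TripleConstraint(NamedTuple):
    """Hypothetical constraint: a property, its allowed occurrences, a value test."""
    predicate: str
    min_count: int
    max_count: Optional[int]          # None means unbounded
    value_ok: Callable[[str], bool]   # test applied to every object value


def conforms(node: str, graph: Set[Triple], shape: List[TripleConstraint]) -> bool:
    """Return True if `node` satisfies every constraint of `shape` in `graph`."""
    for constraint in shape:
        # Collect the objects of all triples leaving `node` via this property.
        values = [o for (s, p, o) in graph
                  if s == node and p == constraint.predicate]
        if len(values) < constraint.min_count:
            return False
        if constraint.max_count is not None and len(values) > constraint.max_count:
            return False
        if not all(constraint.value_ok(v) for v in values):
            return False
    return True


# Assumed sample data, not taken from the paper.
graph = {
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:bob", "foaf:mbox", "mailto:bob@example.org"),
}

# Exactly one name, any number of mailboxes that look like mailto: IRIs.
person_shape = [
    TripleConstraint("foaf:name", 1, 1, lambda v: len(v) > 0),
    TripleConstraint("foaf:mbox", 0, None, lambda v: v.startswith("mailto:")),
]

alice_ok = conforms("ex:alice", graph, person_shape)
bob_ok = conforms("ex:bob", graph, person_shape)
```

In this sketch `ex:alice` conforms, while `ex:bob` does not because the required `foaf:name` triple is missing.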
Verification and Validation of Semantic Annotations
In this paper, we propose a framework to perform verification and validation
of semantically annotated data. The annotations, extracted from websites, are
verified against the schema.org vocabulary and Domain Specifications to ensure
the syntactic correctness and completeness of the annotations. The Domain
Specifications allow checking the compliance of annotations against
corresponding domain-specific constraints. The validation mechanism will detect
errors and inconsistencies between the content of the analyzed schema.org
annotations and the content of the web pages where the annotations were found.
Comment: Accepted for the A.P. Ershov Informatics Conference 2019 (the PSI
Conference Series, 12th edition) proceedin
Using shape expressions (ShEx) to share rdf data models and to guide curation with rigorous validation
International Conference, European Semantic Web Conference, ESWC (16th. 2019. Portorož, Slovenia)
Relational to RDF Data Exchange in Presence of a Shape Expression Schema
We study the relational to RDF data exchange problem, where the target constraints are specified using a Shape Expression schema (ShEx). We investigate two fundamental problems: 1) consistency, which is checking for a given data exchange setting whether there always exists a solution for any source instance, and 2) constructing a universal solution, which is a solution that represents the space of all solutions. We propose to use typed IRI constructors in source-to-target tuple generating dependencies to create the IRIs of the RDF graph from the values in the relational instance, and we translate ShEx into a set of target dependencies. We also identify data exchange settings that are key covered, a property that is decidable and guarantees consistency. Furthermore, we show that this property is a sufficient and necessary condition for the existence of universal solutions for a practical subclass of weakly-recursive ShEx.
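The idea of building IRIs from relational values can be sketched in Python as follows. This is only a loose illustration of an IRI constructor applied by one mapping rule, not the paper's formalism: the `IriConstructor` class, the `exchange` function, the `Emp(id, name)` relation, the shape name `EmpShape` and all `http://example.org/` IRIs are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple
from urllib.parse import quote

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class IriConstructor:
    """Hypothetical typed IRI constructor: key value -> IRI, tagged with a shape."""
    prefix: str
    shape: str

    def __call__(self, key: object) -> str:
        # Percent-encode the key so the result is a syntactically safe IRI.
        return self.prefix + quote(str(key), safe="")


# Assumed constructor for rows of a hypothetical relation Emp(id, name).
emp_iri = IriConstructor("http://example.org/emp/", "EmpShape")
NAME = "http://example.org/name"


def exchange(emp_rows: Iterable[Tuple[object, str]]) -> Tuple[Set[Triple], Dict[str, str]]:
    """Apply one mapping rule: Emp(id, name) -> triple (emp_iri(id), name, name).

    Also records which shape each created IRI is expected to satisfy.
    """
    triples: Set[Triple] = set()
    typing: Dict[str, str] = {}
    for emp_id, name in emp_rows:
        subject = emp_iri(emp_id)
        triples.add((subject, NAME, name))
        typing[subject] = emp_iri.shape
    return triples, typing


triples, typing = exchange([(1, "Ann"), ("a b", "Bob")])
```

Because the constructor is deterministic, two source rows with the same key always yield the same subject IRI, which is what lets key constraints on the source side carry over to the generated graph in this toy setting.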
Towards a Toolbox for Automated Assessment of Machine-Actionable Data Management Plans
Most research funders require Data Management Plans (DMPs). The review process can be time-consuming, since reviewers read text documents submitted by researchers and provide their feedback. Moreover, it requires specific expert knowledge in data stewardship, which is scarce. Machine-actionable Data Management Plans (maDMPs) and semantic technologies increase the potential for automatic assessment of information contained in DMPs. However, the level of automation and the new possibilities are not yet well explored or leveraged. This paper discusses methods for the automation of DMP assessment. It goes beyond generating human-readable reports. It explores how the information contained in maDMPs can be used to provide automated pre-assessment or to fetch further information, allowing reviewers to better judge the content. We map the identified methods to various reviewer goals.
Linked data approach in accessing geospatial big data
Today, linked data is frequently associated with Geographic Information Systems (GIS), as its technology stack is used to alleviate geospatial data integration issues. Geospatial data have become ubiquitous, and these data can be geo-referenced. One type of georeferenced data that is lacking in Malaysia is oceanographic data. Fortunately, most earth observation agencies have granted access to their data obtained from satellite altimetry. Consequently, the exponential growth of geospatial data, together with its complexity and diversity, has led to a big data problem and made information sharing and exchange on the web more complicated. To resolve this issue, linked data should be used in handling geospatial big data. Linked data is one of the best practices for exposing, sharing, publishing and connecting structured data on the web. This study explored linked data as an approach to provide access to the Malaysian physical oceanography datasets on the web, which would allow the data to be standardized in a machine-readable format. The research reviewed the existing software tools used in publishing linked data, identified an appropriate software tool to generate Resource Description Framework (RDF) data representing geographical data, and built a physical oceanography data website based on linked data principles. Initially, document analysis was conducted to review the existing linked data tools that have been used for geospatial data. Various scholarly articles, journals, tutorials and web pages were used as references to investigate the use of linked data tools. Based on the review, five software tools, namely Geometry2RDF, TripleGeo, Datalift, OpenLink Virtuoso and KARMA, were identified as appropriate tools to generate the RDF. Each of these software tools has its own capabilities and functionalities.
Next, the tools were compared with one another based on a literature review to find the best possible tool for managing georeferenced oceanographic data. After the comparison, this study identified Datalift as the best software tool to transform the shapefile into the RDF format. Finally, a web-based information system was built to publish the linked data for interlinking and sharing by web users. In conclusion, this study has introduced an alternative way to publish and access geospatial data, particularly physical oceanography datasets, using linked data principles. Such an approach would help stakeholders and unveil information within the big data, thus enriching the discovery of geospatial information on the web.