Verification and Validation of Semantic Annotations
In this paper, we propose a framework to perform verification and validation
of semantically annotated data. The annotations, extracted from websites, are
verified against the schema.org vocabulary and Domain Specifications to ensure
the syntactic correctness and completeness of the annotations. The Domain
Specifications allow checking the compliance of annotations against
corresponding domain-specific constraints. The validation mechanism will detect
errors and inconsistencies between the content of the analyzed schema.org
annotations and the content of the web pages where the annotations were found.
Comment: Accepted for the A.P. Ershov Informatics Conference 2019 (the PSI
Conference Series, 12th edition) proceedings.
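To make the distinction between verification and validation concrete, here is a minimal sketch in Python. It is not the framework from the paper: the specification format (`HOTEL_SPEC`), the function names `verify` and `validate`, and the substring check against the page text are all assumptions made for illustration.

```python
from typing import Any

# Hypothetical, simplified domain specification: the expected schema.org
# type plus the required properties and the Python type of their values.
HOTEL_SPEC: dict[str, Any] = {
    "type": "Hotel",
    "required": {"name": str, "address": dict, "telephone": str},
}


def verify(annotation: dict[str, Any], spec: dict[str, Any]) -> list[str]:
    """Verification: check an annotation against a domain specification."""
    errors: list[str] = []
    if annotation.get("@type") != spec["type"]:
        errors.append(f"expected @type {spec['type']!r}")
    for prop, expected in spec["required"].items():
        if prop not in annotation:
            errors.append(f"missing required property {prop!r}")
        elif not isinstance(annotation[prop], expected):
            errors.append(f"property {prop!r} should be {expected.__name__}")
    return errors


def validate(annotation: dict[str, Any], page_text: str) -> list[str]:
    """Validation: check that string values also occur in the page text.

    A plain substring test is a crude stand-in for real content comparison.
    """
    return [
        f"value of {prop!r} not found on page"
        for prop, value in annotation.items()
        if not prop.startswith("@")
        and isinstance(value, str)
        and value not in page_text
    ]
```

Under these assumptions, `verify` reports structural problems such as a missing `address`, while `validate` flags an annotation whose `name` does not appear anywhere in the text of the page it was extracted from.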
SWIQA – A SEMANTIC WEB INFORMATION QUALITY ASSESSMENT FRAMEWORK
The internet is currently evolving from the Web of Documents into the Web of Data, where data is available at web scale in the so-called Semantic Web, either (1) to retrieve information or (2) for reuse, e.g. within applications that aim for a higher degree of automation. A lot of data is already available on the Semantic Web, but we know little about its quality because techniques and methodologies for information quality assessment are missing. In this paper, we provide a framework for information quality assessment of Semantic Web data, called SWIQA, that relies solely on Semantic Web technologies. Unlike survey-based techniques for information quality assessment, SWIQA employs data quality rule templates to express quality requirements, which are then used automatically to identify deficient data and calculate quality scores. Hence, our approach minimizes manual effort while providing transparency about the quality of Semantic Web data. SWIQA may therefore be used by data consumers to find high-quality data sources or by data owners to keep track of the quality of their own data.
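The idea of rule templates that yield quality scores can be sketched as follows. This is an illustration in plain Python, not the SWIQA implementation, which works with Semantic Web technologies. The template names `mandatory` and `in_range`, and the score definition (the share of records that satisfy a rule), are assumptions.

```python
from typing import Callable, Iterable

Record = dict[str, object]
Rule = Callable[[Record], bool]


def mandatory(prop: str) -> Rule:
    """Rule template: the property must be present and non-empty."""
    return lambda record: record.get(prop) not in (None, "")


def in_range(prop: str, low: float, high: float) -> Rule:
    """Rule template: the property must be a number within [low, high]."""
    def check(record: Record) -> bool:
        value = record.get(prop)
        return isinstance(value, (int, float)) and low <= value <= high
    return check


def deficient(records: Iterable[Record], rule: Rule) -> list[Record]:
    """Return the records that violate the rule."""
    return [r for r in records if not rule(r)]


def quality_score(records: Iterable[Record], rule: Rule) -> float:
    """Assumed score: fraction of records that satisfy the rule."""
    records = list(records)
    if not records:
        return 1.0  # no data, so nothing violates the rule
    return 1 - len(deficient(records, rule)) / len(records)
```

Instantiating a template with a property name, for example `mandatory("name")`, gives a concrete rule that can be applied to any data set without further manual work, which is the effort-saving aspect the abstract describes.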
Towards a vocabulary for data quality management in semantic web architectures
Generating trust in Semantic Web data requires methodologies and techniques to manage the quality of the published data. The definition of “what is good data” may change depending on the task at hand or the subjective requirements of data owners and data consumers. Many data quality requirements can be modeled as data quality rules, i.e. verifiable aspects that allow potential data quality problems to be identified. In this paper, we provide a vocabulary for representing such rules and other quality-relevant knowledge with the Resource Description Framework and the Web Ontology Language (RDF/OWL). Based on our vocabulary, it is possible to monitor and assess data quality and to automate data cleansing tasks. The use of a standard vocabulary enables the definition of generally applicable SPARQL queries and applications for data quality management (DQM). Furthermore, the explicit representation of rules in RDF/OWL facilitates rule management tasks, e.g. analyzing consistency among the rules, and allows stakeholders to collaborate and create a shared understanding. Thus, it may help to achieve trust in Semantic Web data.
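The benefit of storing rules as data can be shown with a small sketch. Triples are modeled here as plain Python tuples rather than with an RDF library, and a Python function stands in for a generic SPARQL query. All `ex:` terms are hypothetical and are not taken from the vocabulary proposed in the paper.

```python
Triple = tuple[str, str, str]

# Rules expressed as triples with hypothetical ex: terms.
RULES: set[Triple] = {
    ("ex:rule1", "rdf:type", "ex:MandatoryPropertyRule"),
    ("ex:rule1", "ex:testedClass", "ex:Product"),
    ("ex:rule1", "ex:testedProperty", "ex:price"),
}

# Instance data to be checked.
DATA: set[Triple] = {
    ("ex:p1", "rdf:type", "ex:Product"),
    ("ex:p1", "ex:price", "9.99"),
    ("ex:p2", "rdf:type", "ex:Product"),
}


def objects(graph: set[Triple], subject: str, predicate: str) -> set[str]:
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def subjects(graph: set[Triple], predicate: str, obj: str) -> set[str]:
    """All subjects s such that (s, predicate, obj) is in the graph."""
    return {s for s, p, o in graph if p == predicate and o == obj}


def find_violations(rules: set[Triple], data: set[Triple]) -> list[tuple[str, str]]:
    """Generic check: works for any mandatory-property rule in the rule graph."""
    violations: list[tuple[str, str]] = []
    for rule in sorted(subjects(rules, "rdf:type", "ex:MandatoryPropertyRule")):
        for cls in sorted(objects(rules, rule, "ex:testedClass")):
            for prop in sorted(objects(rules, rule, "ex:testedProperty")):
                for instance in sorted(subjects(data, "rdf:type", cls)):
                    if not objects(data, instance, prop):
                        violations.append((rule, instance))
    return violations
```

Because `find_violations` reads the class and property from the rule graph instead of hard-coding them, adding a new rule only means adding triples. This mirrors the point that a standard vocabulary allows generally applicable queries.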