Logics for Unranked Trees: An Overview

Author: A. Arnold
B. Courcelle
B.-H. Schlingloff
E. Clarke
F. Moller
F. Neven
G. Gottlob
I. Walukiewicz
J. Niehren
J.R. Büchi
J.W. Thatcher
L. Cardelli
L. Libkin
L. Stockmeyer
M. Benedikt
M. Grohe
M. Rabin
M.Y. Vardi
P. Bouyer
S. Abiteboul
T. Hafer
W. Thomas
Publication venue: 'Logical Methods in Computer Science e.V.'
Publication date: 01/01/2005

Labeled unranked trees are used as a model of XML documents, and logical languages for them have been studied actively over the past several years. Such logics have different purposes: some are better suited for extracting data, some for expressing navigational properties, and some make it easy to relate complex properties of trees to the existence of tree automata for those properties. Furthermore, logics differ significantly in their model-checking properties, their automata models, and their behavior on ordered and unordered trees. In this paper, we present a survey of logics for unranked trees.

arXiv.org e-Print Archive

CiteSeerX

Reasoning About Integrity Constraints for Tree-Structured Data

Author: Czerwinski Wojciech
David Claire
Murlak Filip
Parys Pawel
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 19th International Conference on Database Theory (ICDT 2016)
Publication date: 01/01/2016

We study a class of integrity constraints for tree-structured data modelled as data trees, whose nodes have a label from a finite alphabet and store a data value from an infinite data domain. The constraints require each tuple of nodes selected by a conjunctive query (using navigational axes and labels) to satisfy a positive combination of equalities and a positive combination of inequalities over the stored data values. Such constraints are instances of the general framework of XML-to-relational constraints proposed recently by Niewerth and Schwentick. They cover some common classes of constraints, including W3C XML Schema key and unique constraints, as well as domain restrictions and denial constraints, but cannot express inclusion constraints, such as reference keys. Our main result is that consistency of such integrity constraints with respect to a given schema (modelled as a tree automaton) is decidable. An easy extension gives decidability for the entailment problem. Equivalently, we show that validity and containment of unions of conjunctive queries using navigational axes, labels, data equalities and inequalities are decidable, as long as none of the conjunctive queries uses both equalities and inequalities; without this restriction, both problems are known to be undecidable. In the context of XML data exchange, our result can be used to establish decidability for a consistency problem for XML schema mappings. All the decision procedures are doubly exponential, with matching lower bounds. The complexity may be lowered to singly exponential when conjunctive queries are replaced by tree patterns and the number of data comparisons is bounded.

Dagstuhl Research Online Publication Server

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Validation Framework for RDF-based Constraint Languages

Author: Hartmann Thomas
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2016

In this thesis, a validation framework is introduced that makes it possible to execute RDF-based constraint languages consistently on RDF data and to formulate constraints of any type. The framework reduces the representation of constraints to the absolute minimum, is based on formal logics, and consists of a small lightweight vocabulary. It ensures consistent validation results and enables constraint transformations for each constraint type across RDF-based constraint languages.

End-to-End Entity Resolution for Big Data: A Survey

Author: Christophides Vassilis
Efthymiou Vasilis
Palpanas Themis
Papadakis George
Stefanidis Kostas
Publication venue
Publication date: 01/02/1988

One of the most important tasks for improving data quality and the reliability of data analytics results is Entity Resolution (ER). ER aims to identify different descriptions that refer to the same real-world entity, and remains a challenging problem. While previous works have studied specific aspects of ER (mostly in traditional settings), in this survey we provide for the first time an end-to-end view of modern ER workflows, and of the novel aspects of entity indexing and matching methods that cope with more than one of the Big Data characteristics simultaneously. We present the basic concepts, processing steps and execution strategies that have been proposed by different communities (database, Semantic Web and machine learning) to cope with the loose structuredness, extreme diversity, high speed and large scale of entity descriptions used by real-world applications. Finally, we provide a synthetic discussion of the existing approaches, and conclude with a detailed presentation of open research directions.

arXiv.org e-Print Archive

University of Richmond

Mining Approximate Keys based on Reasoning from XML Data

Author
Publication venue: 'Deanship of Scientific Research'
Publication date