63 research outputs found

    Structural characterizations of the navigational expressiveness of relation algebras on a tree

    Full text link
    Given a document D in the form of an unordered node-labeled tree, we study the expressiveness on D of various basic fragments of XPath, the core navigational language on XML documents. Working from the perspective of these languages as fragments of Tarski's relation algebra, we give characterizations, in terms of the structure of D, for when a binary relation on its nodes is definable by an expression in these algebras. Since each pair of nodes in such a relation represents a unique path in D, our results therefore capture the sets of paths in D definable in each of the fragments. We refer to this perspective on language semantics as the "global view." In contrast with this global view, there is also a "local view" where one is interested in the nodes to which one can navigate starting from a particular node in the document. In this view, we characterize when a set of nodes in D can be defined as the result of applying an expression to a given node of D. All these definability results, both in the global and the local view, are obtained by using a robust two-step methodology, which consists of first characterizing when two nodes cannot be distinguished by an expression in the respective fragments of XPath, and then bootstrapping these characterizations to the desired results.Comment: 58 Page

    Model Theory of XPath on Data Trees: Part I: Bisimulation and Characterization

    Get PDF
    We investigate model theoretic properties of XPath with data (in)equality tests over the class of data trees, i.e., the class of trees where each node contains a label from a finite alphabet and a data value from an infinite domain.We provide notions of (bi)simulations for XPath logics containing the child, descendant, parent and ancestor axes to navigate the tree. We show that these notions precisely characterize the equivalence relation associated with each logic. We study formula complexity measures consisting of the number of nested axes and nested subformulas in a formula; these notions are akin to the notion of quantifier rank in first-order logic. We show char- acterization results for fine grained notions of equivalence and (bi)simulation that take into account these complexity measures. We also prove that positive fragments of these logics correspond to the formulas preserved under (non-symmetric) simulations. We show that the logic including the child axis is equivalent to the fragment of first-order logic invariant under the corresponding notion of bisimulation. If upward navigation is allowed the characterization fails but a weaker result can still be established. These results hold both over the class of possibly infinite data trees and over the class of finite data trees.Besides their intrinsic theoretical value, we argue that bisimulations are useful tools to prove (non)expressivity results for the logics studied here, and we substantiate this claim with examples.Fil: Figueira, Diego. Centre National de la Recherche Scientifique; FranciaFil: Figueira, Santiago. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; ArgentinaFil: Areces, Carlos Eduardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria; Argentina. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentin

    Axiomatizing the logical core of XPath 2.0

    Get PDF

    Axiomatizing hybrid xpath with data

    Get PDF
    In this paper we introduce sound and strongly complete axiomatizations for XPath with data constraints extended with hybrid operators. First, we present HXPath=, a multi-modal version of XPath with data, extended with nominals and the hybrid operator @. Then, we introduce an axiomatic system for HXPath=, and we prove it is strongly complete with respect to the class of abstract data models, i.e., data models in which data values are abstracted as equivalence relations. We prove a general completeness result similar to the one presented in, e.g., [BtC06], that ensures that certain extensions of the axiomatic system we introduce are also complete. The axiomatic systems that can be obtained in this way cover a large family of hybrid XPath languages over different classes of frames, for which we present concrete examples. In addition, we investigate axiomatizations over the class of tree models, structures widely used in practice. We show that a strongly complete, finitary, first-order axiomatization of hybrid XPath over trees does not exist, and we propose two alternatives to deal with this issue. We finally introduce filtrations to investigate the status of decidability of the satisfiability problem for these languages.Fil: Areces, Carlos Eduardo. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física. Sección Ciencias de la Computación; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba; ArgentinaFil: Fervari, Raul Alberto. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física. Sección Ciencias de la Computación; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba; Argentin

    Similarity and bisimilarity notions appropriate for characterizing indistinguishability in fragments of the calculus of relations

    Full text link
    Motivated by applications in databases, this paper considers various fragments of the calculus of binary relations. The fragments are obtained by leaving out, or keeping in, some of the standard operators, along with some derived operators such as set difference, projection, coprojection, and residuation. For each considered fragment, a characterization is obtained for when two given binary relational structures are indistinguishable by expressions in that fragment. The characterizations are based on appropriately adapted notions of simulation and bisimulation.Comment: 36 pages, Journal of Logic and Computation 201

    Axiomatizing the Logical Core of XPath 2.0

    Get PDF

    DescribeX: A Framework for Exploring and Querying XML Web Collections

    Full text link
    This thesis introduces DescribeX, a powerful framework that is capable of describing arbitrarily complex XML summaries of web collections, providing support for more efficient evaluation of XPath workloads. DescribeX permits the declarative description of document structure using all axes and language constructs in XPath, and generalizes many of the XML indexing and summarization approaches in the literature. DescribeX supports the construction of heterogeneous summaries where different document elements sharing a common structure can be declaratively defined and refined by means of path regular expressions on axes, or axis path regular expression (AxPREs). DescribeX can significantly help in the understanding of both the structure of complex, heterogeneous XML collections and the behaviour of XPath queries evaluated on them. Experimental results demonstrate the scalability of DescribeX summary refinements and stabilizations (the key enablers for tailoring summaries) with multi-gigabyte web collections. A comparative study suggests that using a DescribeX summary created from a given workload can produce query evaluation times orders of magnitude better than using existing summaries. DescribeX's light-weight approach of combining summaries with a file-at-a-time XPath processor can be a very competitive alternative, in terms of performance, to conventional fully-fledged XML query engines that provide DB-like functionality such as security, transaction processing, and native storage.Comment: PhD thesis, University of Toronto, 2008, 163 page

    Implementation of Web Query Languages Reconsidered

    Get PDF
    Visions of the next generation Web such as the "Semantic Web" or the "Web 2.0" have triggered the emergence of a multitude of data formats. These formats have different characteristics as far as the shape of data is concerned (for example tree- vs. graph-shaped). They are accompanied by a puzzlingly large number of query languages each limited to one data format. Thus, a key feature of the Web, namely to make it possible to access anything published by anyone, is compromised. This thesis is devoted to versatile query languages capable of accessing data in a variety of Web formats. The issue is addressed from three angles: language design, common, yet uniform semantics, and common, yet uniform evaluation. % Thus it is divided in three parts: First, we consider the query language Xcerpt as an example of the advocated class of versatile Web query languages. Using this concrete exemplar allows us to clarify and discuss the vision of versatility in detail. Second, a number of query languages, XPath, XQuery, SPARQL, and Xcerpt, are translated into a common intermediary language, CIQLog. This language has a purely logical semantics, which makes it easily amenable to optimizations. As a side effect, this provides the, to the best of our knowledge, first logical semantics for XQuery and SPARQL. It is a very useful tool for understanding the commonalities and differences of the considered languages. Third, the intermediate logical language is translated into a query algebra, CIQCAG. The core feature of CIQCAG is that it scales from tree- to graph-shaped data and queries without efficiency losses when tree-data and -queries are considered: it is shown that, in these cases, optimal complexities are achieved. CIQCAG is also shown to evaluate each of the aforementioned query languages with a complexity at least as good as the best known evaluation methods so far. For example, navigational XPath is evaluated with space complexity O(q d) and time complexity O(q n) where q is the query size, n the data size, and d the depth of the (tree-shaped) data. CIQCAG is further shown to provide linear time and space evaluation of tree-shaped queries for a larger class of graph-shaped data than any method previously proposed. This larger class of graph-shaped data, called continuous-image graphs, short CIGs, is introduced for the first time in this thesis. A (directed) graph is a CIG if its nodes can be totally ordered in such a manner that, for this order, the children of any node form a continuous interval. CIQCAG achieves these properties by employing a novel data structure, called sequence map, that allows an efficient evaluation of tree-shaped queries, or of tree-shaped cores of graph-shaped queries on any graph-shaped data. While being ideally suited to trees and CIGs, the data structure gracefully degrades to unrestricted graphs. It yields a remarkably efficient evaluation on graph-shaped data that only a few edges prevent from being trees or CIGs

    The Simple Knowledge Organization System (SKOS): a situation report for the HIVE Project

    Get PDF
    HIVE (Helping Interdisciplinary Vocabularies Engineering) es un proyecto financiado por el IMLS (Institute of Museums and Library Services), e indirectamente, en Dryad, ambos proyectos en colaboración del Metadata Research Center y el National Evolutionary Synthesis Center (NESCent) in Durham, North Carolina. Con el desarrollo de HIVE se pretende resolver esta problemática mediante una propuesta de generación automática de metadatos que permita la integración dinámica de vocabularios controlados específicos. Para asistir la integración de vocabularios se seleccionó SKOS (Simple Knowledge Organisation System), un estándar del World Wide Web Consortium (W3C) para la representación de sistemas de organización del conocimiento o vocabularios, como tesauros, esquemas de clasificación, sistemas de encabezamiento de materias y taxonomías, en el marco de la Web Semántica.El presente informe realiza un análisis exhaustivo de la situación en cuanto a la aplicación de SKOS. El estudio incluye una detallada revisión de literatura científica y recursos web sobre el modelo, una selección de los proyectos, iniciativas, herramientas, grupos de investigación claves y cualquier otro tipo de información que pudiera ser de relevancia para el logro de los objetivos del proyecto HIVE. Asimismo, se analiza la importancia de SKOS para el logro de la interoperabilidad semántica y se elaboran un conjunto de recomendaciones para los miembros del proyecto HIVE
    corecore