Search CORE

143 research outputs found

Simulation Subsumption or Déjà vu on the Web

Author: A. Klug
D. Olteanu
F. Wei
H. Björklund
H. Tamaki
T. Schwentick
W. Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008
Field of study

Simulation unification is a special kind of unification adapted to retrieving semi-structured data on the Web. This article introduces simulation subsumption, or containment, that is, query subsumption under simulation unification. Simulation subsumption is crucial in general for query optimization, in particular for optimizing pattern-based search engines, and for the termination of recursive rule-based web languages such as the XML and RDF query language Xcerpt. This paper first motivates and formalizes simulation subsumption. Then, it establishes decidability of simulation subsumption for advanced query patterns featuring descendant constructs, regular expressions, negative subterms (or subterm exclusions), and multiple variable occurrences. Finally, we show that subsumption between two query terms can be decided in O(n!n) where n is the sum of the sizes of both query terms

CiteSeerX

Crossref

Open Access LMU

XPath: Looking Forward

Author: Bry François
Furche Tim
Meuss Holger
Schaffert Dan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2002
Field of study

The location path language XPath is of particular importance for XML applications since it is a core component of many XML processing standards such as XSLT or XQuery. In this paper, based on axis symmetry of XPath, equivalences of XPath 1.0 location paths involving reverse axes, such as anc and prec, are established. These equivalences are used as rewriting rules in an algorithm for transforming location paths with reverse axes into equivalent reverse-axis-free ones. Location paths without reverse axes, as generated by the presented rewriting algorithm, enable efficient SAX-like streamed data processing of XPath

CiteSeerX

Open Access LMU

Four Lessons in Versatility or How Query Languages Adapt to the Web

Author: A. Bonifati
A. Gelder van
A. Polleres
A. Polleres
A.C. Klug
B. Adida
B. Cooper
B. Jenner
D. Olteanu
D. Olteanu
D. Recordon
D.D. Chamberlin
D.R. Fulkerson
E. Augurusa
F. Bry
F. Bry
F. Bry
F. Wei
F. Weigel
G. Gottlob
G. Karvounarakis
H. Björklund
H. Garcia-Molina
H. Meuss
H. Meuss
H. Przymusinska
H. Tamaki
H. Wang
H.V. Jagadish
J. Bailey
J. Euzenat
J. Pérez
J. Pérez
J. Pérez
J.D. Ullman
J.J. Carroll
J.V.D. Bussche
K. Kochut
K.A. Ross
K.R. Apt
K.S. Booth
L. Cabibbo
M. Habib
M. Kay
M. Marx
M. Marx
N. Bruno
N. Walsh
P. Boncz
P. Buneman
P. Cholak
P. O’Neil
P.G. Kolaitis
P.P. Schneider
R. Agrawal
R. Fagin
R. Goldman
R. Hull
R. Khare
R. Khare
R. Schenkel
S. Abiteboul
S. Abiteboul
S. Abiteboul
S. Al-Khalifa
S. Berger
S. Groppe
S. Trißl
T. Chen
T. Furche
T. Grust
T. Schwentick
T.C. Przymusinski
U. Assmann
W. Akhtar
W. Chen
W.L. Hsu
W.L. Hsu
Z. Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009
Field of study

Exposing not only human-centered information, but machine-processable data on the Web is one of the commonalities of recent Web trends. It has enabled a new kind of applications and businesses where the data is used in ways not foreseen by the data providers. Yet this exposition has fractured the Web into islands of data, each in different Web formats: Some providers choose XML, others RDF, again others JSON or OWL, for their data, even in similar domains. This fracturing stifles innovation as application builders have to cope not only with one Web stack (e.g., XML technology) but with several ones, each of considerable complexity. With Xcerpt we have developed a rule- and pattern based query language that aims to give shield application builders from much of this complexity: In a single query language XML and RDF data can be accessed, processed, combined, and re-published. Though the need for combined access to XML and RDF data has been recognized in previous work (including the W3C’s GRDDL), our approach differs in four main aspects: (1) We provide a single language (rather than two separate or embedded languages), thus minimizing the conceptual overhead of dealing with disparate data formats. (2) Both the declarative (logic-based) and the operational semantics are unified in that they apply for querying XML and RDF in the same way. (3) We show that the resulting query language can be implemented reusing traditional database technology, if desirable. Nevertheless, we also give a unified evaluation approach based on interval labelings of graphs that is at least as fast as existing approaches for tree-shaped XML data, yet provides linear time and space querying also for many RDF graphs. We believe that Web query languages are the right tool for declarative data access in Web applications and that Xcerpt is a significant step towards a more convenient, yet highly efficient data access in a “Web of Data”

CiteSeerX

Crossref

Open Access LMU

Containment of Pattern-Based Queries over Data Trees

Author: David Claire
Gheerbrant Amélie
Libkin Leonid
Martens Wim
Publication venue: HAL CCSD
Publication date: 01/01/2013
Field of study

International audienceWe study static analysis, in particular the containment problem, for analogs of conjunctive queries over XML documents. The problem has been studied for queries based on arbitrary patterns, not necessarily following the tree structure of documents. However, many applications force the syntactic shape of queries to be tree-like, as they are based on proper tree patterns. This renders previous results, crucially based on having non-tree-like features, inapplicable. Thus, we investigate static analysis of queries based on proper tree patterns. We go beyond simple navigational conjunctive queries in two ways: we look at unions and Boolean combinations of such queries as well and, crucially, all our queries handle data stored in documents, i.e., we deal with containment over data trees. We start by giving a general \Pi^p_2 upper bound on the containment of conjunctive queries and Boolean combinations for patterns that involve all types of navigation through documents. We then show matching hardness for conjunctive queries with all navigation, or their Boolean combinations with the simplest form of navigation. After that we look at cases when containment can be witnessed by homomorphisms of analogs of tableaux. These include conjunctive queries and their unions over child and next-sibling axes; however, we show that not all cases of containment can be witnessed by homomorphisms. We look at extending tree patterns used in queries in three possible ways: with wildcard, with schema information, and with data value comparisons. The first one is relatively harmless, the second one tends to increase complexity by an exponential, and the last one quickly leads to undecidability

CiteSeerX

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Containment for queries over trees with attribute value comparisons

Author: Marx M.
Sherkhonov E.
Publication venue: 'Elsevier BV'
Publication date: 01/06/2016
Field of study

International Migration, Integration and Social Cohesion online publications

Relative Expressive Power of Navigational Querying on Graphs

Author: Bussche Dirk
Dimitri Surinx
Fletcher Marc
George H. L
Gyssens Dirk Leinders
Jan Van
Stijn Vansummeren
Van Gucht
Yuqing Wu
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014
Field of study

Motivated by both established and new applications, we study navigational query languages for graphs (binary relations). The simplest language has only the two operators union and composition, together with the identity relation. We make more powerful languages by adding any of the following operators: intersection; set difference; projection; coprojection; converse; and the diversity relation. All these operators map binary relations to binary relations. We compare the expressive power of all resulting languages. We do this not only for general path queries (queries where the result may be any binary relation) but also for boolean or yes/no queries (expressed by the nonemptiness of an expression). For both cases, we present the complete Hasse diagram of relative expressiveness. In particular the Hasse diagram for boolean queries contains some nontrivial separations and a few surprising collapses.Comment: An extended abstract announcing the results of this paper was presented at the 14th International Conference on Database Theory, Uppsala, Sweden, March 201

arXiv.org e-Print Archive

: Méthodes d'Inférence Symbolique pour les Bases de Données

Author: Staworko Slawomir
Publication venue: HAL CCSD
Publication date: 14/12/2015
Field of study

This dissertation is a summary of a line of research, that I wasactively involved in, on learning in databases from examples. Thisresearch focused on traditional as well as novel database models andlanguages for querying, transforming, and describing the schema of adatabase. In case of schemas our contributions involve proposing anoriginal languages for the emerging data models of Unordered XML andRDF. We have studied learning from examples of schemas for UnorderedXML, schemas for RDF, twig queries for XML, join queries forrelational databases, and XML transformations defined with a novelmodel of tree-to-word transducers.Investigating learnability of the proposed languages required us toexamine closely a number of their fundamental properties, often ofindependent interest, including normal forms, minimization,containment and equivalence, consistency of a set of examples, andfinite characterizability. Good understanding of these propertiesallowed us to devise learning algorithms that explore a possibly largesearch space with the help of a diligently designed set ofgeneralization operations in search of an appropriate solution.Learning (or inference) is a problem that has two parameters: theprecise class of languages we wish to infer and the type of input thatthe user can provide. We focused on the setting where the user inputconsists of positive examples i.e., elements that belong to the goallanguage, and negative examples i.e., elements that do not belong tothe goal language. In general using both negative and positiveexamples allows to learn richer classes of goal languages than usingpositive examples alone. However, using negative examples is oftendifficult because together with positive examples they may cause thesearch space to take a very complex shape and its exploration may turnout to be computationally challenging.Ce mémoire est une courte présentation d’une direction de recherche, à laquelle j’ai activementparticipé, sur l’apprentissage pour les bases de données à partir d’exemples. Cette recherches’est concentrée sur les modèles et les langages, aussi bien traditionnels qu’émergents, pourl’interrogation, la transformation et la description du schéma d’une base de données. Concernantles schémas, nos contributions consistent en plusieurs langages de schémas pour les nouveaumodèles de bases de données que sont XML non-ordonné et RDF. Nous avons ainsi étudiél’apprentissage à partir d’exemples des schémas pour XML non-ordonné, des schémas pour RDF,des requêtes twig pour XML, les requêtes de jointure pour bases de données relationnelles et lestransformations XML définies par un nouveau modèle de transducteurs arbre-à-mot.Pour explorer si les langages proposés peuvent être appris, nous avons été obligés d’examinerde près un certain nombre de leurs propriétés fondamentales, souvent souvent intéressantespar elles-mêmes, y compris les formes normales, la minimisation, l’inclusion et l’équivalence, lacohérence d’un ensemble d’exemples et la caractérisation finie. Une bonne compréhension de cespropriétés nous a permis de concevoir des algorithmes d’apprentissage qui explorent un espace derecherche potentiellement très vaste grâce à un ensemble d’opérations de généralisation adapté àla recherche d’une solution appropriée.L’apprentissage (ou l’inférence) est un problème à deux paramètres : la classe précise delangage que nous souhaitons inférer et le type d’informations que l’utilisateur peut fournir. Nousnous sommes placés dans le cas où l’utilisateur fournit des exemples positifs, c’est-à-dire deséléments qui appartiennent au langage cible, ainsi que des exemples négatifs, c’est-à-dire qui n’enfont pas partie. En général l’utilisation à la fois d’exemples positifs et négatifs permet d’apprendredes classes de langages plus riches que l’utilisation uniquement d’exemples positifs. Toutefois,l’utilisation des exemples négatifs est souvent difficile parce que les exemples positifs et négatifspeuvent rendre la forme de l’espace de recherche très complexe, et par conséquent, son explorationinfaisable

Thèses en Ligne

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Reasoning & Querying – State of the Art

Author: Bry François
Furche Tim
Weiand Klara
Publication venue
Publication date: 31/08/2008
Field of study

Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet where keyword search is used in many applications, e.g. search engines, has familiarized casual users with using keyword queries to retrieve information on the internet. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming at enabling simple querying of semi-structured data, which is relevant e.g. in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF

Open Access LMU

Web and Semantic Web Query Languages

Author: Bailey James
Bry François
Furche Tim
Schaffert Sebastian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005
Field of study

A number of techniques have been developed to facilitate powerful data retrieval on the Web and Semantic Web. Three categories of Web query languages can be distinguished, according to the format of the data they can retrieve: XML, RDF and Topic Maps. This article introduces the spectrum of languages falling into these categories and summarises their salient aspects. The languages are introduced using common sample data and query types. Key aspects of the query languages considered are stressed in a conclusion

Open Access LMU