714 research outputs found

The Complexity of SORE-definability Problems

Author: Chen Haiming
Lu Ping
Wu Zhilin
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017)
Publication date: 01/01/2017

Single occurrence regular expressions (SOREs) are a special kind of deterministic regular expression that is used extensively in the schema languages DTD and XSD for XML documents. In this paper, motivated by the simplification of XML schemas, we consider the SORE-definability problem: given a regular expression, decide whether it has an equivalent SORE. We investigate the complexity of the SORE-definability problem extensively: we consider both (standard) regular expressions and regular expressions with counting, and distinguish between alphabets of size at least two and unary alphabets. In all cases, we obtain tight complexity bounds. In addition, we consider another variant of this problem, the bounded SORE-definability problem, which is to decide, given a regular expression E and a number M (encoded in unary or binary), whether there is an SORE that is equivalent to E on the set of words of length at most M. We show that in several cases, there is an exponential decrease in complexity when switching from the SORE-definability problem to its bounded variant.
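As a rough illustration of the "single occurrence" restriction named in the abstract above, the sketch below checks the purely syntactic property that every symbol appears at most once in an expression. It assumes the usual definition of a SORE (each alphabet symbol occurs at most once), which the abstract does not spell out, and it assumes a DTD-style notation where symbols are identifiers and everything else is an operator. The function names are hypothetical. It does not decide SORE-definability, the problem studied in the paper: an expression that fails this check may still have an equivalent SORE.

```python
import re
from collections import Counter


def symbol_counts(expr: str) -> Counter:
    """Count how often each symbol (identifier) occurs in the expression.

    Assumption: symbols are identifiers such as element names; operators
    like ( ) | , * + ? and whitespace are ignored.
    """
    return Counter(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", expr))


def is_single_occurrence(expr: str) -> bool:
    """Return True if no symbol occurs more than once in the expression.

    This is only the syntactic check; it says nothing about whether an
    equivalent single occurrence expression exists.
    """
    return all(count == 1 for count in symbol_counts(expr).values())
```

Under these assumptions, `is_single_occurrence("(title, author+, (year | date)?)")` holds, while `is_single_occurrence("(a, b)*, a")` does not, because `a` occurs twice.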

Dagstuhl Research Online Publication Server

Earliest Query Answering for Deterministic Nested Word Automata

Author: A. Berlea
A. Neumann
D. Olteanu
G. Miklau
H. Seidl
L. Segoufin
M. Benedikt
M. Grohe
O. Gauwin
R. Alur
W. Martens
Publication venue: 'Nordic Pulp and Paper Research Journal'
Publication date: 01/01/2009

Earliest query answering (EQA) is an objective of many recent streaming algorithms for XML query answering that aim for close to optimal memory management. In this paper, we show that EQA is infeasible even for a small fragment of Forward XPath unless P=NP. We then present an EQA algorithm for queries and schemas defined by deterministic nested word automata (dNWAs) and distinguish a large class of dNWAs for which streaming query answering is feasible in polynomial space and time.

HAL - Lille 3

Crossref

INRIA a CCSD electronic archive server

Efficient asymmetric inclusion of regular expressions with interleaving and counting for XML type-checking

Author: Colazzo Dario
Ghelli Giorgio
Pardini Luca
Sartiani Carlo
Publication venue: ACM
Publication date: 01/01/2013

The inclusion of Regular Expressions (REs) is the kernel of any type-checking algorithm for XML manipulation languages. XML applications would benefit from the extension of REs with interleaving and counting, but this is not feasible in general, since inclusion is EXPSPACE-complete for such extended REs. In Colazzo et al. (2009) [1] we introduced a notion of "conflict-free REs", which are extended REs with excellent complexity behaviour, including a polynomial inclusion algorithm [1] and linear membership (Ghelli et al., 2008 [2]). Conflict-free REs have interleaving and counting, but the complexity is tamed by the "conflict-free" limitations, which have been found to be satisfied by the vast majority of the content models published on the Web. However, a type-checking algorithm needs to compare machine-generated subtypes against human-defined supertypes. The conflict-free restriction, while quite harmless for the human-defined supertype, is far too restrictive for the subtype. We show here that the PTIME inclusion algorithm can actually be extended to deal with totally unrestricted REs with counting and interleaving in the subtype position, provided that the supertype is conflict-free. This is exactly the expressive power that we need in order to use subtyping inside type-checking algorithms, and the cost of this generalized algorithm is only quadratic, which is as good as the best algorithm we have for the symmetric case (see [1]). The result is extremely surprising, since we had previously found that symmetric inclusion becomes NP-hard as soon as the candidate subtype is enriched with binary intersection, a generalization that looked much more innocent than what we achieve here.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

Archivio della Ricerca - Università della Basilicata

Archivio della Ricerca - Università di Pisa

PUblication MAnagement

Complexity and Expressiveness of ShEx for RDF

Author: Boneva Iovka
Gayo José Emilio Labra
Hym Samuel
Prud'hommeaux Eric G.
Solbrig Harold R.
Staworko Slawek
Publication date: 01/01/2015

We study the expressiveness and complexity of Shape Expression Schema (ShEx), a novel schema formalism for RDF currently under development by W3C. ShEx assigns types to the nodes of an RDF graph and constrains the admissible neighborhoods of nodes of a given type with regular bag expressions (RBEs). We formalize and investigate two alternative semantics, multi-type and single-type, depending on whether or not a node may have more than one type. We study the expressive power of ShEx and the complexity of the validation problem. We show that the single-type semantics is strictly more expressive than the multi-type semantics, that single-type validation is generally intractable, and that multi-type validation is feasible for a small (yet practical) subclass of RBEs. To curb the high computational complexity of validation, we propose a natural notion of determinism and show that multi-type validation for the class of deterministic schemas using single-occurrence regular bag expressions (SORBEs) is tractable.

INRIA a CCSD electronic archive server

Repositorio Institucional de la Universidad de Oviedo

HAL Descartes

Edinburgh Research Explorer

Dagstuhl Research Online Publication Server

Hal-Diderot

SWI-Prolog and the Web

Author: Gras
Huang
JAN WIELEMAKER
LOURENS VAN DER MEIJ
Mäkelä
Ramakrishnan
Wielemaker
ZHISHENG HUANG
Publication date: 06/11/2007

Where Prolog is commonly seen as a component in a Web application that is either embedded or communicates using a proprietary protocol, we propose an architecture where Prolog communicates with other components in a Web application using the standard HTTP protocol. By avoiding embedding in external Web servers, development and deployment become much easier. To support this architecture, in addition to the transfer protocol, we must also support parsing, representing and generating the key Web document types such as HTML, XML and RDF. This paper motivates the design decisions in the libraries and extensions to Prolog for handling Web documents and protocols. The design has been guided by the requirement to handle large documents efficiently. The described libraries support a wide range of Web applications, ranging from HTML and XML documents to Semantic Web RDF processing. To appear in Theory and Practice of Logic Programming (TPLP). Comment: 31 pages, 24 figures and 2 tables.

arXiv.org e-Print Archive

VU Research Portal

Crossref

International Migration, Integration and Social Cohesion online publications

UvA-DARE

: Méthodes d'Inférence Symbolique pour les Bases de Données

Author: Staworko Slawomir
Publication venue: HAL CCSD
Publication date: 14/12/2015

This dissertation is a summary of a line of research, in which I was actively involved, on learning in databases from examples. This research focused on traditional as well as novel database models and languages for querying, transforming, and describing the schema of a database. In the case of schemas, our contributions involve proposing original languages for the emerging data models of Unordered XML and RDF. We have studied learning from examples of schemas for Unordered XML, schemas for RDF, twig queries for XML, join queries for relational databases, and XML transformations defined with a novel model of tree-to-word transducers. Investigating learnability of the proposed languages required us to examine closely a number of their fundamental properties, often of independent interest, including normal forms, minimization, containment and equivalence, consistency of a set of examples, and finite characterizability. Good understanding of these properties allowed us to devise learning algorithms that explore a possibly large search space with the help of a diligently designed set of generalization operations in search of an appropriate solution. Learning (or inference) is a problem that has two parameters: the precise class of languages we wish to infer and the type of input that the user can provide. We focused on the setting where the user input consists of positive examples, i.e., elements that belong to the goal language, and negative examples, i.e., elements that do not belong to the goal language. In general, using both negative and positive examples allows learning richer classes of goal languages than using positive examples alone. However, using negative examples is often difficult because together with positive examples they may cause the search space to take a very complex shape, and its exploration may turn out to be computationally challenging.

Thèses en Ligne

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Automatic Service Composition. Models, Techniques and Tools.

Author: BERARDI Daniela
Publication venue: La Sapienza
Publication date: 10/03/2005

Maurizio Lenzerini, Giuseppe De Giacomo, Massimo Mecell

Pubblicazioni Aperte Digitali Interateneo Sapienza

Archivio della ricerca- Università di Roma La Sapienza

Logische Grundlagen von Datenbanktransformationen für Datenbanken mit komplexen Typen

Author: Wang Qing
Publication date: 01/01/2010

Database transformations consist of queries and updates, which are two fundamental types of computation in any database: the first provides the capability to retrieve data and the second is used to maintain databases in light of ever-changing application domains. With the rising popularity of web-based applications and service-oriented architectures, the development of database transformations must address new challenges, which frequently call for establishing a theoretical framework that unifies both queries and updates over complex-value databases. This dissertation aims to lay down the foundations for establishing a theoretical framework of database transformations in the context of complex-value databases. We shall use an approach that has successfully been used for the characterisation of sequential algorithms. The sequential Abstract State Machine (ASM) thesis captures the semantics and behaviour of sequential algorithms. The thesis uses the similarity of general computations and database transformations for characterisation of the latter by five postulates: the sequential time postulate, the abstract state postulate, the bounded exploration postulate, the background postulate, and the bounded non-determinism postulate. The last two postulates reflect the specific form of transformations for databases. The five postulates exactly capture database transformations. Furthermore, we provide a logical proof system for database transformations that is sound and complete.

MACAU: Open Access Repository of Kiel University