437 research outputs found
Efficient asymmetric inclusion of regular expressions with interleaving and counting for XML type-checking
The inclusion of Regular Expressions (REs) is the kernel of any type-checking algorithm for XML manipulation languages. XML applications would benefit from the extension of REs with interleaving and counting, but this is not feasible in general, since inclusion is EXPSPACE-complete for such extended REs. In Colazzo et al. (2009) [1] we introduced a notion of ?conflict-free REs?, which are extended REs with excellent complexity behaviour, including a polynomial inclusion algorithm [1] and linear membership (Ghelli et al., 2008 [2]). Conflict-free REs have interleaving and counting, but the complexity is tamed by the ?conflict-free? limitations, which have been found to be satisfied by the vast majority of the content models published on the Web.However, a type-checking algorithm needs to compare machine-generated subtypes against human-defined supertypes. The conflict-free restriction, while quite harmless for the human-defined supertype, is far too restrictive for the subtype. We show here that the PTIME inclusion algorithm can be actually extended to deal with totally unrestricted REs with counting and interleaving in the subtype position, provided that the supertype is conflict-free.This is exactly the expressive power that we need in order to use subtyping inside type-checking algorithms, and the cost of this generalized algorithm is only quadratic, which is as good as the best algorithm we have for the symmetric case (see [1]). The result is extremely surprising, since we had previously found that symmetric inclusion becomes NP-hard as soon as the candidate subtype is enriched with binary intersection, a generalization that looked much more innocent than what we achieve here
Deciding Definability by Deterministic Regular Expressions
International audienceWe investigate the complexity of deciding whether a given regular language can be defined with a deterministic regular expression. Our main technical result shows that the problem is Pspace-complete if the input language is represented as a regular expression or nondeterministic finite automaton. The problem becomes Expspace-complete if the language is represented as a regular expression with counters
Satisfiability of Downward XPath with Data Equality Tests
International audienceIn this work we investigate the satisfiability problem for the logic XPath(â*, â, =), that includes all downward axes as well as equality and inequality tests. We address this problem in the absence of DTDs and the sibling axis. We prove that this fragment is decidable, and we nail down its complexity, showing the problem to be ExpTime-complete. The result also holds when path expressions allow closure under the Kleene star operator. To obtain these results, we introduce a new automaton model over data trees that captures XPath(â*, â, =) and has an ExpTime emptiness problem. Furthermore, we give the exact complexity of several downward-looking fragments
Expressive Logical Combinators for Free
International audienceA popular technique for the analysis of web query languages relies on the translation of queries into logical formulas. These formulas are then solved for satisfiability using an off-the-shelf satisfiabil-ity solver. A critical aspect in this approach is the size of the obtained logical formula, since it constitutes a factor that affects the combined complexity of the global approach. We present logical combi-nators whose benefit is to provide an exponential gain in succinctness in terms of the size of the logical representation. This opens the way for solving a wide range of problems such as satisfiability and containment for expressive query languages in exponential-time, even though their direct formulation into the underlying logic results in an exponential blowup of the formula size, yielding an incorrectly presumed two-exponential time complexity. We illustrate this from a practical point of view on a few examples such as numerical occurrence constraints and tree frontier properties which are concrete problems found with semi-structured data
Schemas for Unordered XML on a DIME
We investigate schema languages for unordered XML having no relative order
among siblings. First, we propose unordered regular expressions (UREs),
essentially regular expressions with unordered concatenation instead of
standard concatenation, that define languages of unordered words to model the
allowed content of a node (i.e., collections of the labels of children).
However, unrestricted UREs are computationally too expensive as we show the
intractability of two fundamental decision problems for UREs: membership of an
unordered word to the language of a URE and containment of two UREs.
Consequently, we propose a practical and tractable restriction of UREs,
disjunctive interval multiplicity expressions (DIMEs).
Next, we employ DIMEs to define languages of unordered trees and propose two
schema languages: disjunctive interval multiplicity schema (DIMS), and its
restriction, disjunction-free interval multiplicity schema (IMS). We study the
complexity of the following static analysis problems: schema satisfiability,
membership of a tree to the language of a schema, schema containment, as well
as twig query satisfiability, implication, and containment in the presence of
schema. Finally, we study the expressive power of the proposed schema languages
and compare them with yardstick languages of unordered trees (FO, MSO, and
Presburger constraints) and DTDs under commutative closure. Our results show
that the proposed schema languages are capable of expressing many practical
languages of unordered trees and enjoy desirable computational properties.Comment: Theory of Computing System
- âŠ