91 research outputs found

    Bounded repairability for regular tree languages

    Get PDF
    We study the problem of bounded repairability of a given restriction tree language R into a target tree language T. More precisely, we say that R is bounded repairable w.r.t. T if there exists a bound on the number of standard tree editing operations necessary to apply to any tree in R in order to obtain a tree in T. We consider a number of possible specifications for tree languages: bottom-up tree automata (on curry encoding of unranked trees) that capture the class of XML Schemas and DTDs. We also consider a special case when the restriction language R is universal, i.e., contains all trees over a given alphabet. We give an effective characterization of bounded repairability between pairs of tree languages represented with automata. This characterization introduces two tools, synopsis trees and a coverage relation between them, allowing one to reason about tree languages that undergo a bounded number of editing operations. We then employ this characterization to provide upper bounds to the complexity of deciding bounded repairability and we show that these bounds are tight. In particular, when the input tree languages are specified with arbitrary bottom-up automata, the problem is coNEXPTIME-complete. The problem remains coNEXPTIME-complete even if we use deterministic non-recursive DTDs to specify the input languages. The complexity of the problem can be reduced if we assume that the alphabet, the set of node labels, is fixed: the problem becomes PSPACE-complete for non-recursive DTDs and coNP-complete for deterministic non-recursive DTDs. Finally, when the restriction tree language R is universal, we show that the bounded repairability problem becomes EXPTIME-complete if the target language is specified by an arbitrary bottom-up tree automaton and becomes tractable (PTIME-complete, in fact) when a deterministic bottom-up automaton is used

    Bounded repairability for regular tree languages

    Get PDF
    International audienceWe consider the problem of repairing unranked trees (e.g., XML documents) satisfying a given restriction specification R (e.g., a DTD) into unranked trees satisfying a given target specification T. Specifically, we focus on the question of whether one can get from any tree in a regular language R to some tree in another regular language T with a finite, uniformly bounded, number of edit operations (i.e., deletions and insertions of nodes). We give effective characterizations of the pairs of specifications R and T for which such a uniform bound exists, and we study the complexity of the problem under different representations of the regular tree languages (e.g., non-deterministic stepwise automata, deterministic stepwise automata, DTDs). Finally, we point out some connections with the analogous problem for regular languages of words

    Which DTDs are streaming bounded repairable?

    Get PDF
    ABSTRACT Integrity constraint management concerns both checking whether data is valid and taking action to restore correctness when invalid data is discovered. In XML the notion of valid data can be captured by schema languages such as Document Type Definitions (DTDs) and more generally XML schemas. DTDs have the property that constraint checking can be done in streaming fashion. In this paper we consider when the corresponding action to restore validity -repair -can be done in streaming fashion. We formalize this as the problem of determining, given a DTD, whether or not a streaming procedure exists that transforms an input document so as to satisfy the DTD, using a number of edits independent of the document. We show that this problem is decidable. In fact, we show the decidability of a more general problem, allowing a more general class of schemas than DTDs, and requiring a repair procedure that works only for documents that are already known to satisfy another class of constraints. The decision procedure relies on a new analysis of the structure of DTDs, reducing to a novel notion of game played on pushdown systems associated with the schemas

    Edit Distance for Pushdown Automata

    Get PDF
    The edit distance between two words w1,w2w_1, w_2 is the minimal number of word operations (letter insertions, deletions, and substitutions) necessary to transform w1w_1 to w2w_2. The edit distance generalizes to languages L1,L2\mathcal{L}_1, \mathcal{L}_2, where the edit distance from L1\mathcal{L}_1 to L2\mathcal{L}_2 is the minimal number kk such that for every word from L1\mathcal{L}_1 there exists a word in L2\mathcal{L}_2 with edit distance at most kk. We study the edit distance computation problem between pushdown automata and their subclasses. The problem of computing edit distance to a pushdown automaton is undecidable, and in practice, the interesting question is to compute the edit distance from a pushdown automaton (the implementation, a standard model for programs with recursion) to a regular language (the specification). In this work, we present a complete picture of decidability and complexity for the following problems: (1)~deciding whether, for a given threshold kk, the edit distance from a pushdown automaton to a finite automaton is at most kk, and (2)~deciding whether the edit distance from a pushdown automaton to a finite automaton is finite.Comment: An extended version of a paper accepted to ICALP 2015 with the same title. The paper has been accepted to the LMCS journa

    IST Austria Technical Report

    Get PDF
    The edit distance between two words w1, w2 is the minimal number of word operations (letter insertions, deletions, and substitutions) necessary to transform w1 to w2. The edit distance generalizes to languages L1, L2, where the edit distance is the minimal number k such that for every word from L1 there exists a word in L2 with edit distance at most k. We study the edit distance computation problem between pushdown automata and their subclasses. The problem of computing edit distance to a pushdown automaton is undecidable, and in practice, the interesting question is to compute the edit distance from a pushdown automaton (the implementation, a standard model for programs with recursion) to a regular language (the specification). In this work, we present a complete picture of decidability and complexity for deciding whether, for a given threshold k, the edit distance from a pushdown automaton to a finite automaton is at most k
    • …
    corecore