187 research outputs found

    Lower bounds for the size of deterministic unranked tree automata

    Tree automata operating on unranked trees use regular languages, called horizontal languages, to define the transitions of the vertical states that define the bottom-up computation of the automaton. It is well known that the deterministic tree automaton with the smallest total number of states, that is, the number of vertical states plus the number of states used to define the horizontal languages, is not unique, and that it is hard to establish lower bounds for the total number of states. By relying on existing bounds for the size of unambiguous finite automata, we give a lower bound for the size blow-up of determinizing a nondeterministic unranked tree automaton. The lower bound improves the earlier known lower bound that was based on an ad hoc construction.
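To make the vertical/horizontal distinction concrete, here is a minimal Python sketch of a bottom-up run. It is not taken from the paper: the names `Tree`, `DFA`, `run`, `total_states` and the Boolean-circuit example are all illustrative assumptions. Each pair (label, vertical state) owns a horizontal DFA that reads the word of vertical states assigned to the children, and the total size is counted as vertical states plus horizontal DFA states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tree:
    label: str
    children: tuple = ()


@dataclass
class DFA:
    """Horizontal DFA reading a word of vertical states (partial transitions)."""
    start: int
    accepting: set
    delta: dict  # (state, symbol) -> state

    def accepts(self, word):
        state = self.start
        for symbol in word:
            state = self.delta.get((state, symbol))
            if state is None:  # missing transition: reject
                return False
        return state in self.accepting

    def states(self):
        return ({self.start} | set(self.accepting)
                | {s for (s, _) in self.delta} | set(self.delta.values()))


def empty_only():
    # Accepts only the empty word, i.e. the node must be a leaf.
    return DFA(0, {0}, {})


def all_of(x):
    # Accepts nonempty words consisting only of x.
    return DFA(0, {1}, {(0, x): 1, (1, x): 1})


def contains(x, alphabet=("T", "F")):
    # Accepts words over `alphabet` with at least one occurrence of x.
    delta = {(0, a): (1 if a == x else 0) for a in alphabet}
    delta.update({(1, a): 1 for a in alphabet})
    return DFA(0, {1}, delta)


# Hypothetical example automaton: evaluates and/or circuits over leaves 0 and 1.
VERTICAL = {"T", "F"}
HORIZONTAL = {
    ("1", "T"): empty_only(),
    ("0", "F"): empty_only(),
    ("and", "T"): all_of("T"),
    ("and", "F"): contains("F"),
    ("or", "T"): contains("T"),
    ("or", "F"): all_of("F"),
}


def run(horizontal, tree):
    """Return the vertical state reached at the root, or None if the run fails."""
    word = []
    for child in tree.children:
        q = run(horizontal, child)
        if q is None:
            return None
        word.append(q)
    matches = [q for (label, q), dfa in horizontal.items()
               if label == tree.label and dfa.accepts(word)]
    # Determinism: at most one horizontal language may accept the child word.
    return matches[0] if len(matches) == 1 else None


def total_states(vertical, horizontal):
    # Size measure from the abstract: vertical states plus horizontal states.
    return len(vertical) + sum(len(d.states()) for d in horizontal.values())
```

In this sketch the horizontal languages for a fixed label are pairwise disjoint, which is what makes the bottom-up computation deterministic. The size measure sums over all horizontal DFAs, so two automata for the same tree language can trade vertical states against horizontal ones.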

    Transformations Between Different Types of Unranked Bottom-Up Tree Automata

    We consider the representational state complexity of unranked tree automata. The bottom-up computation of an unranked tree automaton may be either deterministic or nondeterministic, and further variants arise depending on whether the horizontal string languages defining the transitions are represented by a DFA or an NFA. Also, we consider for unranked tree automata the alternative syntactic definition of determinism introduced by Cristau et al. (FCT'05, Lect. Notes Comput. Sci. 3623, pp. 68-79). We establish upper and lower bounds for the state complexity of conversions between different types of unranked tree automata. Comment: In Proceedings DCFS 2010, arXiv:1008.127

    Operational State Complexity of Deterministic Unranked Tree Automata

    We consider the state complexity of basic operations on tree languages recognized by deterministic unranked tree automata. For the operations of union and intersection the upper and lower bounds of both weakly and strongly deterministic tree automata are obtained. For tree concatenation we establish a tight upper bound that is of a different order than the known state complexity of concatenation of regular string languages. We show that (n+1)((m+1)2^n - 2^(n-1)) - 1 vertical states are sufficient, and necessary in the worst case, to recognize the concatenation of tree languages recognized by (strongly or weakly) deterministic automata with, respectively, m and n vertical states. Comment: In Proceedings DCFS 2010, arXiv:1008.127
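As a quick arithmetic check of the bound quoted above, the following Python sketch evaluates it for small m and n. The function names are illustrative. The string-case figure m·2^n − 2^(n−1) is not stated in the abstract; it is the commonly cited DFA concatenation bound, included here only as an assumed point of comparison.

```python
def tree_concat_bound(m: int, n: int) -> int:
    """Vertical-state bound from the abstract: (n+1)((m+1)2^n - 2^(n-1)) - 1."""
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive")
    return (n + 1) * ((m + 1) * 2**n - 2 ** (n - 1)) - 1


def string_concat_bound(m: int, n: int) -> int:
    """Assumed comparison value for string DFAs: m*2^n - 2^(n-1)."""
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive")
    return m * 2**n - 2 ** (n - 1)


if __name__ == "__main__":
    for m, n in [(1, 1), (2, 2), (3, 2), (4, 4)]:
        print(m, n, tree_concat_bound(m, n), string_concat_bound(m, n))
```

For m = n = 2 the tree bound evaluates to 29, against 6 for the assumed string bound; the extra factor of n+1 is what puts the two bounds in different orders.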

    Deterministic Automata for Unordered Trees

    Automata for unordered unranked trees are relevant for defining schemas and queries for data trees in JSON or XML format. While the existing notions are well-investigated concerning expressiveness, they all lack a proper notion of determinism, which makes it difficult to distinguish subclasses of automata for which problems such as inclusion, equivalence, and minimization can be solved efficiently. In this paper, we propose and investigate different notions of "horizontal determinism", starting from automata for unranked trees in which the horizontal evaluation is performed by finite state automata. We show that a restriction to confluent horizontal evaluation leads to polynomial-time emptiness and universality, but still suffers from coNP-completeness of the emptiness of binary intersections. Finally, efficient algorithms can be obtained by imposing an order of horizontal evaluation globally for all automata in the class. Depending on the choice of the order, we obtain different classes of automata, each of which has the same expressiveness as CMSO. Comment: In Proceedings GandALF 2014, arXiv:1408.556

    Minimizing Tree Automata for Unranked Trees

    Automata for unranked trees form a foundation for XML schemas, querying and pattern languages. We study the problem of efficiently minimizing such automata. We start with the unranked tree automata (UTAs) that are standard in database theory, assuming bottom-up determinism and that horizontal recursion is represented by deterministic finite automata. We show that minimal UTAs in that class are not unique and that minimization is NP-hard. We then study more recent automata classes that do allow for polynomial time minimization. Among those, we show that bottom-up deterministic stepwise tree automata yield the most succinct representations.

    Bounded repairability for regular tree languages

    We study the problem of bounded repairability of a given restriction tree language R into a target tree language T. More precisely, we say that R is bounded repairable w.r.t. T if there exists a bound on the number of standard tree editing operations necessary to apply to any tree in R in order to obtain a tree in T. We consider a number of possible specifications for tree languages: bottom-up tree automata (on curry encoding of unranked trees) that capture the class of XML Schemas and DTDs. We also consider a special case when the restriction language R is universal, i.e., contains all trees over a given alphabet. We give an effective characterization of bounded repairability between pairs of tree languages represented with automata. This characterization introduces two tools, synopsis trees and a coverage relation between them, allowing one to reason about tree languages that undergo a bounded number of editing operations. We then employ this characterization to provide upper bounds to the complexity of deciding bounded repairability and we show that these bounds are tight. In particular, when the input tree languages are specified with arbitrary bottom-up automata, the problem is coNEXPTIME-complete. The problem remains coNEXPTIME-complete even if we use deterministic non-recursive DTDs to specify the input languages. The complexity of the problem can be reduced if we assume that the alphabet, the set of node labels, is fixed: the problem becomes PSPACE-complete for non-recursive DTDs and coNP-complete for deterministic non-recursive DTDs. Finally, when the restriction tree language R is universal, we show that the bounded repairability problem becomes EXPTIME-complete if the target language is specified by an arbitrary bottom-up tree automaton and becomes tractable (PTIME-complete, in fact) when a deterministic bottom-up automaton is used.

    Dictionary-Based Tree Compression (Invited Talk)

    Trees are a ubiquitous data structure in computer science. LISP, for instance, was designed to manipulate nested lists, that is, ordered unranked trees. Already at that time, DAGs were used to detect common subexpressions, a process known as "hash consing." In a DAG every distinct subtree is represented only once (but can be referenced many times) and hence it constitutes a dictionary-based compression method for ordered trees. In our compression scenario we distinguish two kinds of ordered trees: binary and unranked. The latter appear naturally as representations of XML document structures. We survey these dictionary-based compression methods for ordered trees: (1) DAGs, (2) hybrid DAGs, (3) straight-line context-free tree grammars ("SLT grammars"). We compare the minimal DAG of an unranked tree with the minimal DAG of its binary tree encoding. The latter is obtained by identifying first children of the unranked tree with left children of the binary tree, and next-siblings with the right children. For XML document trees, unranked DAGs are usually smaller than encoded binary DAGs. We show that this holds for arbitrary unranked trees, on average. We also present the "hybrid DAG"; its size lower-bounds those of the binary and unranked DAGs. Finding a smallest SLT grammar for a given tree is NP-complete. We discuss two linear-time approximation algorithms: BPLEX and TreeRePair. For typical XML document trees, TreeRePair produces SLT grammars that are only one fourth of the size of the minimal DAG, and which contain approximately 3% of the edges of the original tree. As far as we know, this gives rise to the smallest existing pointer-based tree representation. We show that some basic algorithms can be computed directly on the compressed trees, without prior decompression. Examples include the execution of different kinds of tree automata, and the real-time traversal of the original tree.
    It is even possible to evaluate simple XPath queries directly on the SLT grammars, using deterministic node-selecting tree automata. In this way, impressive speed-ups are achieved over existing XPath evaluators, while at the same time the memory requirement is slashed to only a few percent. For more complex XPath queries that require nondeterministic node-selecting tree automata, efficient evaluation over SLT grammars remains a difficult challenge.
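The DAG comparison described in this abstract can be illustrated with a small Python sketch. It is an assumption-laden toy, not the construction from the talk: trees are nested tuples `(label, children)`, the helper names are hypothetical, and only non-empty nodes and pointers are counted in the binary case. Hash consing is done with a dictionary that maps each distinct (label, child ids) key to a node id.

```python
def unranked_dag_size(tree):
    """Return (nodes, edges) of the minimal DAG of an unranked tree."""
    table = {}  # (label, tuple of child ids) -> node id

    def intern(t):
        label, children = t
        key = (label, tuple(intern(c) for c in children))
        # setdefault reuses the id of an already seen subtree (hash consing).
        return table.setdefault(key, len(table))

    intern(tree)
    edges = sum(len(child_ids) for (_, child_ids) in table)
    return len(table), edges


def fcns(forest):
    """First-child/next-sibling encoding of a forest; None is the empty tree."""
    if not forest:
        return None
    (label, children), rest = forest[0], forest[1:]
    # Left child = first child, right child = next sibling.
    return (label, fcns(children), fcns(rest))


def binary_dag_size(binary):
    """Return (nodes, edges) of the minimal DAG of a binary tree from fcns."""
    table = {}  # (label, left id or None, right id or None) -> node id

    def intern(b):
        if b is None:
            return None
        label, left, right = b
        key = (label, intern(left), intern(right))
        return table.setdefault(key, len(table))

    intern(binary)
    edges = sum(1 for (_, l, r) in table for p in (l, r) if p is not None)
    return len(table), edges


if __name__ == "__main__":
    a = ("a", ())
    g = ("g", (a, a))
    f = ("f", (g, g))  # 7 nodes and 6 edges as a plain tree
    print(unranked_dag_size(f))        # (3, 4)
    print(binary_dag_size(fcns((f,)))) # (5, 5)
```

In this example the repeated subtree `g(a, a)` is shared in the unranked DAG, while the encoding makes the two copies of `g` differ in their next-sibling pointer, so the binary DAG shares less. This is one instance only; the abstract's claim concerns typical XML trees and averages.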

    Ramsey Quantifiers over Automatic Structures: Complexity and Applications to Verification
