10,040 research outputs found

    Formal Properties of XML Grammars and Languages

    Full text link
    XML documents are described by a document type definition (DTD). An XML-grammar is a formal grammar that captures the syntactic features of a DTD. We investigate properties of this family of grammars. We show that every XML-language basically has a unique XML-grammar. We give two characterizations of languages generated by XML-grammars, one is set-theoretic, the other is by a kind of saturation property. We investigate decidability problems and prove that some properties that are undecidable for general context-free languages become decidable for XML-languages. We also characterize those XML-grammars that generate regular XML-languages.Comment: 24 page

    Principles and Implementation of Deductive Parsing

    Get PDF
    We present a system for generating parsers based directly on the metaphor of parsing as deduction. Parsing algorithms can be represented directly as deduction systems, and a single deduction engine can interpret such deduction systems so as to implement the corresponding parser. The method generalizes easily to parsers for augmented phrase structure formalisms, such as definite-clause grammars and other logic grammar formalisms, and has been used for rapid prototyping of parsing algorithms for a variety of formalisms including variants of tree-adjoining grammars, categorial grammars, and lexicalized context-free grammars.Comment: 69 pages, includes full Prolog cod

    Pattern matching in compilers

    Get PDF
    In this thesis we develop tools for effective and flexible pattern matching. We introduce a new pattern matching system called amethyst. Amethyst is not only a generator of parsers of programming languages, but can also serve as an alternative to tools for matching regular expressions. Our framework also produces dynamic parsers. Its intended use is in the context of IDE (accurate syntax highlighting and error detection on the fly). Amethyst offers pattern matching of general data structures. This makes it a useful tool for implementing compiler optimizations such as constant folding, instruction scheduling, and dataflow analysis in general. The parsers produced are essentially top-down parsers. Linear time complexity is obtained by introducing the novel notion of structured grammars and regularized regular expressions. Amethyst uses techniques known from compiler optimizations to produce effective parsers.Comment: master thesi

    On the Herbrand content of LK

    Full text link
    We present a structural representation of the Herbrand content of LK-proofs with cuts of complexity prenex Sigma-2/Pi-2. The representation takes the form of a typed non-deterministic tree grammar of order 2 which generates a finite language of first-order terms that appear in the Herbrand expansions obtained through cut-elimination. In particular, for every Gentzen-style reduction between LK-proofs we study the induced grammars and classify the cases in which language equality and inclusion hold.Comment: In Proceedings CL&C 2016, arXiv:1606.0582

    The Lambek calculus with iteration: two variants

    Full text link
    Formulae of the Lambek calculus are constructed using three binary connectives, multiplication and two divisions. We extend it using a unary connective, positive Kleene iteration. For this new operation, following its natural interpretation, we present two lines of calculi. The first one is a fragment of infinitary action logic and includes an omega-rule for introducing iteration to the antecedent. We also consider a version with infinite (but finitely branching) derivations and prove equivalence of these two versions. In Kleene algebras, this line of calculi corresponds to the *-continuous case. For the second line, we restrict our infinite derivations to cyclic (regular) ones. We show that this system is equivalent to a variant of action logic that corresponds to general residuated Kleene algebras, not necessarily *-continuous. Finally, we show that, in contrast with the case without division operations (considered by Kozen), the first system is strictly stronger than the second one. To prove this, we use a complexity argument. Namely, we show, using methods of Buszkowski and Palka, that the first system is Π10\Pi_1^0-hard, and therefore is not recursively enumerable and cannot be described by a calculus with finite derivations
    • …
    corecore