Formal Properties of XML Grammars and Languages
XML documents are described by a document type definition (DTD). An
XML-grammar is a formal grammar that captures the syntactic features of a DTD.
We investigate properties of this family of grammars. We show that every
XML-language basically has a unique XML-grammar. We give two characterizations
of languages generated by XML-grammars: one is set-theoretic, the other is by a
kind of saturation property. We investigate decidability problems and prove
that some properties that are undecidable for general context-free languages
become decidable for XML-languages. We also characterize those XML-grammars
that generate regular XML-languages. Comment: 24 pages
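As a rough illustration of the idea that an XML-grammar ties each tag to a regular set describing its permitted children, the following Python sketch checks a sequence of opening and closing tags against a small grammar. The grammar, the token format, and the names `GRAMMAR`, `parse_element`, and `in_language` are assumptions made for this example; they are not taken from the paper.

```python
import re

# Hypothetical DTD-like grammar: each element name maps to a regular
# expression over the names of its children (each name followed by a space).
GRAMMAR = {
    "a": r"(a |b )*",  # <a> may contain any mix of <a> and <b> elements
    "b": r"",          # <b> must be empty
}


def parse_element(tokens, pos, grammar):
    """Parse one element starting at tokens[pos].

    Returns the position just after the element, or None if the tokens
    do not form an element allowed by the grammar.
    """
    if pos >= len(tokens):
        return None
    token = tokens[pos]
    if not token.startswith("<") or token.startswith("</"):
        return None
    name = token[1:-1]
    if name not in grammar:
        return None
    pos += 1
    children = []
    # Consume child elements until a closing tag is reached.
    while pos < len(tokens) and not tokens[pos].startswith("</"):
        children.append(tokens[pos][1:-1])
        pos = parse_element(tokens, pos, grammar)
        if pos is None:
            return None
    # The closing tag must match the opening tag.
    if pos >= len(tokens) or tokens[pos] != f"</{name}>":
        return None
    # The sequence of child names must belong to the regular set for `name`.
    word = "".join(child + " " for child in children)
    if re.fullmatch(grammar[name], word) is None:
        return None
    return pos + 1


def in_language(tokens, root, grammar=GRAMMAR):
    """True if tokens form exactly one well-formed, valid element named root."""
    end = parse_element(tokens, 0, grammar)
    return end == len(tokens) and bool(tokens) and tokens[0] == f"<{root}>"
```

For instance, `in_language(["<a>", "<b>", "</b>", "</a>"], "a")` accepts, while a `<b>` element with any child is rejected because its regular set contains only the empty word.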
Principles and Implementation of Deductive Parsing
We present a system for generating parsers based directly on the metaphor of
parsing as deduction. Parsing algorithms can be represented directly as
deduction systems, and a single deduction engine can interpret such deduction
systems so as to implement the corresponding parser. The method generalizes
easily to parsers for augmented phrase structure formalisms, such as
definite-clause grammars and other logic grammar formalisms, and has been used
for rapid prototyping of parsing algorithms for a variety of formalisms
including variants of tree-adjoining grammars, categorial grammars, and
lexicalized context-free grammars. Comment: 69 pages, includes full Prolog code
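The parsing-as-deduction view can be sketched with a small agenda-driven engine. The Python example below is not the paper's Prolog system; it is a minimal sketch, assuming a toy grammar in Chomsky normal form and CKY-style items `(category, start, end)`. The names `UNARY`, `BINARY`, `axioms`, `consequences`, and `recognize` are hypothetical.

```python
from collections import deque

# Hypothetical toy grammar in Chomsky normal form.
UNARY = {("Det", "the"), ("N", "dog"), ("N", "cat"), ("V", "sees")}
BINARY = {("S", "NP", "VP"), ("NP", "Det", "N"), ("VP", "V", "NP")}


def axioms(words):
    """Axiom items: (A, i, i+1) for every rule A -> w matching word i."""
    for i, word in enumerate(words):
        for lhs, terminal in UNARY:
            if terminal == word:
                yield (lhs, i, i + 1)


def consequences(trigger, chart):
    """Items derivable in one step from trigger and one item in the chart.

    Inference rule: (B, i, k), (C, k, j), A -> B C  |-  (A, i, j).
    """
    cat, i, j = trigger
    for other_cat, k, l in chart:
        for lhs, left, right in BINARY:
            # Trigger is the left premise, chart item the right premise.
            if left == cat and right == other_cat and k == j:
                yield (lhs, i, l)
            # Trigger is the right premise, chart item the left premise.
            if left == other_cat and right == cat and l == i:
                yield (lhs, k, j)


def recognize(words, start="S"):
    """Run the deduction engine to closure and test for the goal item."""
    chart = set()
    agenda = deque(axioms(words))
    while agenda:
        item = agenda.popleft()
        if item in chart:
            continue  # already proved
        chart.add(item)
        agenda.extend(list(consequences(item, chart)))
    return (start, 0, len(words)) in chart
```

Only `axioms` and `consequences` encode the parsing algorithm; the loop in `recognize` is generic, which is the sense in which one engine could interpret different deduction systems.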
Pattern matching in compilers
In this thesis we develop tools for effective and flexible pattern matching.
We introduce a new pattern matching system called amethyst. Amethyst is not
only a generator of parsers of programming languages, but can also serve as an
alternative to tools for matching regular expressions.
Our framework also produces dynamic parsers. Its intended use is in the
context of an IDE (accurate syntax highlighting and error detection on the fly).
Amethyst offers pattern matching of general data structures. This makes it a
useful tool for implementing compiler optimizations such as constant folding,
instruction scheduling, and dataflow analysis in general.
The parsers produced are essentially top-down parsers. Linear time complexity
is obtained by introducing the novel notion of structured grammars and
regularized regular expressions. Amethyst uses techniques known from compiler
optimizations to produce effective parsers. Comment: master thesis
On the Herbrand content of LK
We present a structural representation of the Herbrand content of LK-proofs
with cuts of complexity prenex Sigma-2/Pi-2. The representation takes the form
of a typed non-deterministic tree grammar of order 2 which generates a finite
language of first-order terms that appear in the Herbrand expansions obtained
through cut-elimination. In particular, for every Gentzen-style reduction
between LK-proofs we study the induced grammars and classify the cases in which
language equality and inclusion hold. Comment: In Proceedings CL&C 2016, arXiv:1606.0582
The Lambek calculus with iteration: two variants
Formulae of the Lambek calculus are constructed using three binary
connectives, multiplication and two divisions. We extend it using a unary
connective, positive Kleene iteration. For this new operation, following its
natural interpretation, we present two lines of calculi. The first one is a
fragment of infinitary action logic and includes an omega-rule for introducing
iteration to the antecedent. We also consider a version with infinite (but
finitely branching) derivations and prove equivalence of these two versions. In
Kleene algebras, this line of calculi corresponds to the *-continuous case. For
the second line, we restrict our infinite derivations to cyclic (regular) ones.
We show that this system is equivalent to a variant of action logic that
corresponds to general residuated Kleene algebras, not necessarily
*-continuous. Finally, we show that, in contrast with the case without division
operations (considered by Kozen), the first system is strictly stronger than
the second one. To prove this, we use a complexity argument. Namely, we show,
using methods of Buszkowski and Palka, that the first system is $\Pi_1^0$-hard,
and therefore is not recursively enumerable and cannot be described by a
calculus with finite derivations.