Regular Expression Types for XML
We propose regular expression types as a foundation for statically typed XML processing languages. Regular expression types, like most schema languages for XML, introduce regular expression notations such as repetition (*), alternation (|), etc., to describe XML documents. The novelty of our type system is a semantic presentation of subtyping, as inclusion between the sets of documents denoted by two types. We give several examples illustrating the usefulness of this form of subtyping in XML processing.
The decision problem for the subtype relation reduces to the inclusion problem between tree automata, which is known to be EXPTIME-complete. To avoid this high complexity in typical cases, we develop a practical algorithm that, unlike classical algorithms based on determinization of tree automata, checks the inclusion relation by a top-down traversal of the original type expressions. The main advantage of this algorithm is that it can exploit the property that type expressions being compared often share portions of their representations. Our algorithm is a variant of Aiken and Murphy's set-inclusion constraint solver, to which are added several new implementation techniques, correctness proofs, and preliminary performance measurements on some small programs in the domain of typed XML processing.
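To make "subtyping as inclusion between denoted sets" concrete, here is a minimal sketch for the much simpler case of word languages rather than trees. It is not the top-down algorithm described in the abstract: it assumes both types are already given as total deterministic automata, and it searches the product for a state pair where the first accepts and the second rejects. The names `included`, `A_PLUS`, and `A_STAR` are hypothetical.

```python
from collections import deque

def included(a, b, alphabet):
    """Return True if every word accepted by DFA `a` is also accepted by DFA `b`.

    Each DFA is a tuple (start, accepting_states, delta), where delta is a
    total transition dict mapping (state, symbol) -> state.
    """
    (start_a, final_a, delta_a), (start_b, final_b, delta_b) = a, b
    seen = {(start_a, start_b)}
    queue = deque([(start_a, start_b)])
    while queue:
        p, q = queue.popleft()
        # A reachable pair where `a` accepts but `b` rejects is a counterexample.
        if p in final_a and q not in final_b:
            return False
        for sym in alphabet:
            nxt = (delta_a[(p, sym)], delta_b[(q, sym)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# Illustrative "types" over the alphabet {a, b} (assumed for this sketch).
# a+ : state 0 = start, 1 = accepting, 2 = dead.
A_PLUS = (0, {1}, {(0, "a"): 1, (0, "b"): 2,
                   (1, "a"): 1, (1, "b"): 2,
                   (2, "a"): 2, (2, "b"): 2})
# a* : state 0 = start and accepting, 1 = dead.
A_STAR = (0, {0}, {(0, "a"): 0, (0, "b"): 1,
                   (1, "a"): 1, (1, "b"): 1})

if __name__ == "__main__":
    print(included(A_PLUS, A_STAR, "ab"))  # a+ is a subtype of a*
    print(included(A_STAR, A_PLUS, "ab"))  # the empty word is a counterexample
```

The product search visits at most one pair per combination of states, which is cheap for deterministic word automata. The difficulty the abstract points to arises for nondeterministic tree automata, where determinizing first can cause an exponential blow-up.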
All-order celestial OPE in the MHV sector
On-shell kinematics for gluon scattering can be parametrized with points on
the celestial sphere; in the limit where these points collide, it is known that
tree-level gluon scattering amplitudes exhibit an operator product expansion
(OPE)-like structure. While it is possible to obtain singular contributions to
this celestial OPE, getting regular contributions from both holomorphic and
anti-holomorphic sectors is more difficult. In this paper, we use twistor
string theory to describe the maximal helicity violating (MHV) sector of
tree-level, four-dimensional gluon scattering as an effective 2d conformal
field theory on the celestial sphere. By organizing the OPE between vertex
operators in this theory in terms of soft gluon descendants, we obtain
all-order expressions for the celestial OPE which include all regular
contributions in the collinear expansion. This gives new, all-order formulae
for the collinear splitting function (in momentum space) and celestial OPE
coefficients (in the conformal primary basis) of tree-level MHV gluon
scattering. We obtain these results for both positive and negative helicity
gluons, and for any incoming/outgoing kinematic configuration within the MHV
sector. Comment: 23 pages, no figures
A Counting Logic for Structure Transition Systems
Quantitative questions such as "what is the maximum number of tokens
in a place of a Petri net?" or "what is the maximal reachable height
of the stack of a pushdown automaton?" play a significant role in
understanding models of computation. To study such problems in a
systematic way, we introduce structure transition systems on which
one can define logics that mix temporal expressions (e.g. reachability) with properties of a state (e.g. the height of the stack). We propose a counting logic Qmu[#MSO] which can express questions like the ones above, as well as many boundedness problems studied so far. We show that Qmu[#MSO] has good algorithmic properties; in particular, we generalize two standard methods in model checking, decomposition on trees and model checking through parity games, to this quantitative logic. These properties are used to prove decidability of Qmu[#MSO] on tree-producing pushdown systems, a generalization of both pushdown systems and regular tree grammars.
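The first question quoted above, the maximum number of tokens in a place of a Petri net, can be answered by brute force when the net happens to be bounded. The sketch below does exactly that and nothing more: it is not the Qmu[#MSO] model-checking procedure, and the function name `max_tokens`, the example net `NET`, and the `limit` cut-off are assumptions made for illustration.

```python
from collections import deque

def max_tokens(initial, transitions, place, limit=10_000):
    """Explore reachable markings; return the largest token count seen in `place`.

    `initial` is a tuple of token counts, one entry per place. Each transition
    is a pair (consume, produce) of tuples of the same length. Raises
    RuntimeError once `limit` markings have been seen, since the net may be
    unbounded and the search would then never terminate.
    """
    seen = {initial}
    queue = deque([initial])
    best = initial[place]
    while queue:
        marking = queue.popleft()
        best = max(best, marking[place])
        for consume, produce in transitions:
            # A transition is enabled if every place holds enough tokens.
            if all(t >= c for t, c in zip(marking, consume)):
                nxt = tuple(t - c + p
                            for t, c, p in zip(marking, consume, produce))
                if nxt not in seen:
                    if len(seen) >= limit:
                        raise RuntimeError("exploration limit reached; "
                                           "net may be unbounded")
                    seen.add(nxt)
                    queue.append(nxt)
    return best

# Hypothetical net with places (p0, p1, p2):
#   t1 takes one token from p0 and puts two tokens into p1,
#   t2 moves one token from p1 to p2.
NET = [((1, 0, 0), (0, 2, 0)),
       ((0, 1, 0), (0, 0, 1))]

if __name__ == "__main__":
    print(max_tokens((2, 0, 0), NET, place=1))
```

Explicit enumeration only terminates on bounded nets, which is why the cut-off is there. Deciding such quantitative questions in general requires symbolic methods of the kind the abstract develops.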
A Tree Logic with Graded Paths and Nominals
Regular tree grammars and regular path expressions constitute core constructs
widely used in programming languages and type systems. Nevertheless, there has
been little research so far on reasoning frameworks for path expressions where
node cardinality constraints occur along a path in a tree. We present a logic
capable of expressing deep counting along paths which may include arbitrary
recursive forward and backward navigation. The counting extensions can be seen
as a generalization of graded modalities that count immediate successor nodes.
While the combination of graded modalities, nominals, and inverse modalities
yields undecidable logics over graphs, we show that these features can be
combined in a tree logic decidable in exponential time.
Pattern matching in compilers
In this thesis we develop tools for effective and flexible pattern matching.
We introduce a new pattern matching system called amethyst. Amethyst is not
only a generator of parsers of programming languages, but can also serve as an
alternative to tools for matching regular expressions.
Our framework also produces dynamic parsers. Its intended use is in the
context of IDE (accurate syntax highlighting and error detection on the fly).
Amethyst offers pattern matching of general data structures. This makes it a
useful tool for implementing compiler optimizations such as constant folding,
instruction scheduling, and dataflow analysis in general.
The parsers produced are essentially top-down parsers. Linear time complexity
is obtained by introducing the novel notion of structured grammars and
regularized regular expressions. Amethyst uses techniques known from compiler
optimizations to produce effective parsers. Comment: master thesis
Analyzing Catastrophic Backtracking Behavior in Practical Regular Expression Matching
We develop a formal perspective on how regular expression matching works in
Java, a popular representative of the category of regex-directed matching
engines. In particular, we define an automata model which captures all the
aspects needed to study such matching engines in a formal way. Based on this,
we propose two types of static analysis, which take a regular expression and
tell whether there exists a family of strings which makes Java-style matching
run in exponential time. Comment: In Proceedings AFL 2014, arXiv:1405.527
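The exponential behaviour mentioned above can be reproduced with a toy backtracking matcher that counts its own steps. This is a minimal sketch, not Java's engine and not the automata model from the paper: the tuple-based regex representation and the names `run`, `full_match`, and `PATTERN` are assumptions. The pattern `(a|a)*b` offers two ways to consume each `a`, so on a string of `a`s with no final `b` the matcher tries every combination before failing.

```python
def run(node, s, i, k, steps):
    """Backtracking match of regex `node` against s[i:], then continuation `k`."""
    steps[0] += 1
    kind = node[0]
    if kind == "chr":
        return i < len(s) and s[i] == node[1] and k(i + 1)
    if kind == "cat":
        return run(node[1], s, i, lambda j: run(node[2], s, j, k, steps), steps)
    if kind == "alt":
        return run(node[1], s, i, k, steps) or run(node[2], s, i, k, steps)
    if kind == "star":
        # Greedy: try another iteration that consumes input, else stop here.
        more = lambda j: j > i and run(node, s, j, k, steps)
        return run(node[1], s, i, more, steps) or k(i)
    raise ValueError(f"unknown node kind: {kind}")

def full_match(node, s):
    """Return (matched, number_of_matcher_steps) for a whole-string match."""
    steps = [0]
    ok = run(node, s, 0, lambda j: j == len(s), steps)
    return bool(ok), steps[0]

# (a|a)*b : two identical alternatives give two ways to consume each 'a'.
A = ("chr", "a")
PATTERN = ("cat", ("star", ("alt", A, A)), ("chr", "b"))

if __name__ == "__main__":
    for n in (4, 8, 12):
        matched, steps = full_match(PATTERN, "a" * n)
        print(n, matched, steps)
```

Each extra `a` in the failing input at least doubles the step count, which is the kind of string family the static analyses in the abstract aim to detect without running the matcher.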
A Computational Interpretation of Context-Free Expressions
We phrase parsing with context-free expressions as a type inhabitation
problem where values are parse trees and types are context-free expressions. We
first show how containment among context-free and regular expressions can be
reduced to a reachability problem by using a canonical representation of
states. The proofs-as-programs principle yields a computational interpretation
of the reachability problem in terms of a coercion that transforms the parse
tree for a context-free expression into a parse tree for a regular expression.
It also yields a partial coercion from regular parse trees to context-free
ones. The partial coercion from the trivial language of all words to a
context-free expression corresponds to a predictive parser for the expression
- …