9,107 research outputs found
Inferring Chemical Reaction Patterns Using Rule Composition in Graph Grammars
Modeling molecules as undirected graphs and chemical reactions as graph
rewriting operations is a natural and convenient approach tom odeling
chemistry. Graph grammar rules are most naturally employed to model elementary
reactions like merging, splitting, and isomerisation of molecules. It is often
convenient, in particular in the analysis of larger systems, to summarize
several subsequent reactions into a single composite chemical reaction. We use
a generic approach for composing graph grammar rules to define a chemically
useful rule compositions. We iteratively apply these rule compositions to
elementary transformations in order to automatically infer complex
transformation patterns. This is useful for instance to understand the net
effect of complex catalytic cycles such as the Formose reaction. The
automatically inferred graph grammar rule is a generic representative that also
covers the overall reaction pattern of the Formose cycle, namely two carbonyl
groups that can react with a bound glycolaldehyde to a second glycolaldehyde.
Rule composition also can be used to study polymerization reactions as well as
more complicated iterative reaction schemes. Terpenes and the polyketides, for
instance, form two naturally occurring classes of compounds of utmost
pharmaceutical interest that can be understood as "generalized polymers"
consisting of five-carbon (isoprene) and two-carbon units, respectively
Pattern matching in compilers
In this thesis we develop tools for effective and flexible pattern matching.
We introduce a new pattern matching system called amethyst. Amethyst is not
only a generator of parsers of programming languages, but can also serve as an
alternative to tools for matching regular expressions.
Our framework also produces dynamic parsers. Its intended use is in the
context of IDE (accurate syntax highlighting and error detection on the fly).
Amethyst offers pattern matching of general data structures. This makes it a
useful tool for implementing compiler optimizations such as constant folding,
instruction scheduling, and dataflow analysis in general.
The parsers produced are essentially top-down parsers. Linear time complexity
is obtained by introducing the novel notion of structured grammars and
regularized regular expressions. Amethyst uses techniques known from compiler
optimizations to produce effective parsers.Comment: master thesi
An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities
We describe an extension of Earley's parser for stochastic context-free
grammars that computes the following quantities given a stochastic context-free
grammar and an input string: a) probabilities of successive prefixes being
generated by the grammar; b) probabilities of substrings being generated by the
nonterminals, including the entire string being generated by the grammar; c)
most likely (Viterbi) parse of the string; d) posterior expected number of
applications of each grammar production, as required for reestimating rule
probabilities. (a) and (b) are computed incrementally in a single left-to-right
pass over the input. Our algorithm compares favorably to standard bottom-up
parsing methods for SCFGs in that it works efficiently on sparse grammars by
making use of Earley's top-down control structure. It can process any
context-free rule format without conversion to some normal form, and combines
computations for (a) through (d) in a single algorithm. Finally, the algorithm
has simple extensions for processing partially bracketed inputs, and for
finding partial parses and their likelihoods on ungrammatical inputs.Comment: 45 pages. Slightly shortened version to appear in Computational
Linguistics 2
If the Current Clique Algorithms are Optimal, so is Valiant's Parser
The CFG recognition problem is: given a context-free grammar
and a string of length , decide if can be obtained from
. This is the most basic parsing question and is a core computer
science problem. Valiant's parser from 1975 solves the problem in
time, where is the matrix multiplication
exponent. Dozens of parsing algorithms have been proposed over the years, yet
Valiant's upper bound remains unbeaten. The best combinatorial algorithms have
mildly subcubic complexity.
Lee (JACM'01) provided evidence that fast matrix multiplication is needed for
CFG parsing, and that very efficient and practical algorithms might be hard or
even impossible to obtain. Lee showed that any algorithm for a more general
parsing problem with running time can
be converted into a surprising subcubic algorithm for Boolean Matrix
Multiplication. Unfortunately, Lee's hardness result required that the grammar
size be . Nothing was known for the more relevant
case of constant size grammars.
In this work, we prove that any improvement on Valiant's algorithm, even for
constant size grammars, either in terms of runtime or by avoiding the
inefficiencies of fast matrix multiplication, would imply a breakthrough
algorithm for the -Clique problem: given a graph on nodes, decide if
there are that form a clique.
Besides classifying the complexity of a fundamental problem, our reduction
has led us to similar lower bounds for more modern and well-studied cubic time
problems for which faster algorithms are highly desirable in practice: RNA
Folding, a central problem in computational biology, and Dyck Language Edit
Distance, answering an open question of Saha (FOCS'14)
Graphical Reasoning in Compact Closed Categories for Quantum Computation
Compact closed categories provide a foundational formalism for a variety of
important domains, including quantum computation. These categories have a
natural visualisation as a form of graphs. We present a formalism for
equational reasoning about such graphs and develop this into a generic proof
system with a fixed logical kernel for equational reasoning about compact
closed categories. Automating this reasoning process is motivated by the slow
and error prone nature of manual graph manipulation. A salient feature of our
system is that it provides a formal and declarative account of derived results
that can include `ellipses'-style notation. We illustrate the framework by
instantiating it for a graphical language of quantum computation and show how
this can be used to perform symbolic computation.Comment: 21 pages, 9 figures. This is the journal version of the paper
published at AIS
- …