Analytic aspects of the shuffle product
There exist very lucid explanations of the combinatorial origins of rational
and algebraic functions, in particular with respect to regular and context free
languages. In the search to understand how to extend these natural
correspondences, we find that the shuffle product models many key aspects of
D-finite generating functions, a class which contains the algebraic functions. We consider
several different takes on the shuffle product, shuffle closure, and shuffle
grammars, and give explicit generating function consequences. In the process,
we define a grammar class that models D-finite generating functions.
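As a small illustration of the operation this abstract is built on, the following sketch enumerates the shuffle product of two words, that is, all interleavings that preserve the letter order within each word. The function name `shuffle` is our own and is not taken from the paper. For words of lengths p and q it returns C(p+q, p) interleavings, counted with multiplicity.

```python
from math import comb


def shuffle(u: str, v: str) -> list[str]:
    """Return all interleavings of u and v (with multiplicity).

    Letters of u keep their relative order, and so do letters of v.
    """
    if not u:
        return [v]
    if not v:
        return [u]
    # The first letter of the result comes either from u or from v.
    from_u = [u[0] + w for w in shuffle(u[1:], v)]
    from_v = [v[0] + w for w in shuffle(u, v[1:])]
    return from_u + from_v


if __name__ == "__main__":
    print(sorted(shuffle("ab", "c")))
    print(len(shuffle("abc", "de")), comb(5, 2))
```

When the two words use disjoint alphabets, all interleavings are distinct, which is why the binomial coefficient appears as the count.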
Controlled non uniform random generation of decomposable structures
Consider a class of decomposable combinatorial structures, using different
types of atoms $\Atoms = \{\At_1,\ldots ,\At_{|{\Atoms}|}\}$. We address the
random generation of such structures with respect to a size $n$ and a targeted
distribution in $k$ of its \emph{distinguished} atoms. We consider two
variations on this problem. In the first alternative, the targeted distribution
is given by $k$ real numbers $\TargFreq_1, \ldots, \TargFreq_k$ such that $0 <
\TargFreq_i < 1$ for all $i$ and $\TargFreq_1+\cdots+\TargFreq_k \leq 1$. We
aim to generate random structures among the whole set of structures of a given
size $n$, in such a way that the {\em expected} frequency of any distinguished
atom $\At_i$ equals $\TargFreq_i$. We address this problem by weighting the
atoms with a $k$-tuple $\Weights$ of real-valued weights, inducing a weighted
distribution over the set of structures of size $n$. We first adapt the
classical recursive random generation scheme into an algorithm taking
$\bigO{n^{1+o(1)}+mn\log{n}}$ arithmetic operations to draw structures from
the $\Weights$-weighted distribution. Secondly, we address the analytical
computation of weights such that the targeted frequencies are achieved
asymptotically, i.e. for large values of $n$. We derive systems of functional
equations whose resolution gives an explicit relationship between $\Weights$
and $\TargFreq_1, \ldots, \TargFreq_k$. Lastly, we give an algorithm in
$\bigO{k n^4}$ for the inverse problem, {\it i.e.} computing the frequencies
associated with a given $k$-tuple $\Weights$ of weights, and an optimized
version in $\bigO{k n^2}$ in the case of context-free languages. This allows
for a heuristic resolution of the weights/frequencies relationship suitable for
complex specifications. In the second alternative, the targeted distribution is
given by $k$ natural numbers $n_1, \ldots, n_k$ such that
$n_1+\cdots+n_k+r=n$, where $r \geq 0$ is the number of undistinguished atoms.
The structures must be generated uniformly among the set of structures of size
$n$ that contain {\em exactly} $n_i$ atoms $\At_i$ ($1 \leq i \leq k$). We give
a $\bigO{r^2\prod_{i=1}^k n_i^2 +m n k \log n}$ algorithm for generating $m$
structures, which simplifies into a $\bigO{r\prod_{i=1}^k n_i +m n}$ algorithm for
regular specifications.
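The weighted recursive scheme mentioned in this abstract can be sketched on a toy specification. The example below is an assumption of ours, not the paper's algorithm: it uses the grammar S → ε | a S | ( S ) S, puts a weight `w` on the single distinguished atom `a`, precomputes weighted counts in quadratic time, and then draws a word of size n by choosing each rule with probability proportional to its weighted count. The names `weighted_counts` and `sample` are hypothetical.

```python
import random


def weighted_counts(n: int, w: float) -> list[float]:
    """W[m] = total weight of words of length m, where each 'a' has weight w."""
    W = [1.0] + [0.0] * n
    for m in range(1, n + 1):
        # Rule S -> a S contributes w * W[m-1];
        # rule S -> ( S ) S splits the remaining m-2 letters in two parts.
        W[m] = w * W[m - 1] + sum(W[i] * W[m - 2 - i] for i in range(m - 1))
    return W


def sample(n: int, W: list[float], w: float, rng: random.Random) -> str:
    """Draw a word of length n with probability proportional to its weight."""
    if n == 0:
        return ""
    r = rng.random() * W[n]
    r -= w * W[n - 1]
    if r < 0:
        return "a" + sample(n - 1, W, w, rng)
    for i in range(n - 1):
        r -= W[i] * W[n - 2 - i]
        if r < 0:
            return "(" + sample(i, W, w, rng) + ")" + sample(n - 2 - i, W, w, rng)
    # Only reachable through floating-point rounding; fall back to a valid rule.
    return "a" + sample(n - 1, W, w, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    W = weighted_counts(20, 2.0)
    print(sample(20, W, 2.0, rng))
```

Raising `w` above 1 makes words with many `a` atoms more likely, which is the mechanism the abstract uses to steer expected frequencies. With `w = 1.0` the counts reduce to the Motzkin numbers and the sampler is uniform.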
If the Current Clique Algorithms are Optimal, so is Valiant's Parser
The CFG recognition problem is: given a context-free grammar $\mathcal{G}$
and a string $w$ of length $n$, decide if $w$ can be obtained from
$\mathcal{G}$. This is the most basic parsing question and is a core computer
science problem. Valiant's parser from 1975 solves the problem in
$O(n^{\omega})$ time, where $\omega$ is the matrix multiplication
exponent. Dozens of parsing algorithms have been proposed over the years, yet
Valiant's upper bound remains unbeaten. The best combinatorial algorithms have
mildly subcubic complexity.
Lee (JACM'01) provided evidence that fast matrix multiplication is needed for
CFG parsing, and that very efficient and practical algorithms might be hard or
even impossible to obtain. Lee showed that any algorithm for a more general
parsing problem with running time $O(g\,n^{3-\varepsilon})$, where $g$ denotes the grammar size, can
be converted into a surprising subcubic algorithm for Boolean Matrix
Multiplication. Unfortunately, Lee's hardness result required that the grammar
size be $g = \Omega(n^6)$. Nothing was known for the more relevant
case of constant size grammars.
In this work, we prove that any improvement on Valiant's algorithm, even for
constant size grammars, either in terms of runtime or by avoiding the
inefficiencies of fast matrix multiplication, would imply a breakthrough
algorithm for the $k$-Clique problem: given a graph on $n$ nodes, decide if
there are $k$ nodes that form a clique.
Besides classifying the complexity of a fundamental problem, our reduction
has led us to similar lower bounds for more modern and well-studied cubic time
problems for which faster algorithms are highly desirable in practice: RNA
Folding, a central problem in computational biology, and Dyck Language Edit
Distance, answering an open question of Saha (FOCS'14).
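To make the recognition problem concrete, here is a minimal sketch of the classical cubic-time CYK algorithm for a grammar in Chomsky normal form. This is the textbook combinatorial baseline, not Valiant's matrix-multiplication-based parser. The function name `cyk`, the rule encoding, and the example grammar for non-empty balanced parentheses are our own assumptions.

```python
def cyk(word, unary, binary, start="S"):
    """Decide whether `word` is derivable from `start`.

    unary:  list of (A, c) for rules A -> c, with c a terminal character
    binary: list of (A, B, C) for rules A -> B C
    """
    n = len(word)
    if n == 0:
        return False  # this sketch ignores the empty word
    # table[i][l] holds the nonterminals deriving word[i:i+l].
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = {A for A, c in unary if c == ch}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for k in range(1, length):
                left, right = table[i][k], table[i + k][length - k]
                for A, B, C in binary:
                    if B in left and C in right:
                        table[i][length].add(A)
    return start in table[0][n]


# Hypothetical CNF grammar for non-empty balanced parentheses.
UNARY = [("L", "("), ("R", ")")]
BINARY = [("S", "S", "S"), ("S", "L", "R"), ("S", "L", "T"), ("T", "S", "R")]

if __name__ == "__main__":
    for w in ["(())()", "(()", ")("]:
        print(w, cyk(w, UNARY, BINARY))
```

The three nested loops over `length`, `i`, and `k` give the cubic dependence on the string length that the lower bound in this abstract is measured against.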
Multi-dimensional Boltzmann Sampling of Languages
This paper addresses the uniform random generation of words from a
context-free language (over an alphabet of size $k$), while constraining every
letter to a targeted frequency of occurrence. Our approach consists in a
multidimensional extension of Boltzmann samplers \cite{Duchon2004}. We show
that, under mostly \emph{strong-connectivity} hypotheses, our samplers return a
word of size in $[(1-\varepsilon)n, (1+\varepsilon)n]$ and exact frequency in
$\mathcal{O}(n^{1+k/2})$ expected time. Moreover, if we accept tolerance
intervals of width in $\Omega(\sqrt{n})$ for the number of occurrences of each
letter, our samplers perform an approximate-size generation of words in
$\mathcal{O}(n)$ expected time. We illustrate these techniques on the
generation of Tetris tessellations with uniform statistics in the different
types of tetraminoes. Comment: 12p
Generating all permutations by context-free grammars in Chomsky normal form
Let $L_n$ be the finite language of all $n!$ strings that are permutations of $n$ different symbols ($n \geq 1$). We consider context-free grammars $G_n$ in Chomsky normal form that generate $L_n$. In particular we study a few families $\{G_n\}_{n \geq 1}$, satisfying $L(G_n) = L_n$ for $n \geq 1$, with respect to their descriptional complexity, i.e. we determine the number of nonterminal symbols and the number of production rules of $G_n$ as functions of $n$.
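One simple way to obtain a Chomsky normal form grammar for the permutation language is sketched below. It is an illustrative construction of our own and not necessarily one of the families studied in the paper. It uses one nonterminal per non-empty subset of the symbols, so it has $2^n - 1$ nonterminals and $n 2^{n-1}$ production rules. The function names are hypothetical.

```python
from itertools import combinations, permutations


def permutation_grammar(symbols: str) -> dict:
    """Map each nonterminal (a frozenset of symbols) to its right-hand sides.

    A_{a} -> a                   for every symbol a
    A_S   -> A_{a} A_{S - {a}}   for every a in S, when |S| >= 2
    """
    rules = {}
    for size in range(1, len(symbols) + 1):
        for subset in combinations(symbols, size):
            S = frozenset(subset)
            if size == 1:
                rules[S] = [subset[0]]
            else:
                rules[S] = [(frozenset({a}), S - {a}) for a in subset]
    return rules


def language(rules: dict, nonterminal: frozenset) -> set:
    """Enumerate all words derivable from `nonterminal` (finite language)."""
    words = set()
    for rhs in rules[nonterminal]:
        if isinstance(rhs, str):
            words.add(rhs)
        else:
            left, right = rhs
            words |= {u + v
                      for u in language(rules, left)
                      for v in language(rules, right)}
    return words


if __name__ == "__main__":
    rules = permutation_grammar("abcd")
    print(len(rules), sum(len(rhs) for rhs in rules.values()))
    print(sorted(language(rules, frozenset("abcd")))[:5])
```

Both measures grow exponentially in $n$ for this construction, which gives a feel for why the number of nonterminals and rules are the quantities of interest.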
Grammar-based Representation and Identification of Dynamical Systems
In this paper we propose a novel approach to identify dynamical systems. The
method estimates the model structure and the parameters of the model
simultaneously, automating the critical decisions involved in identification
such as model structure and complexity selection. In order to solve the
combined model structure and model parameter estimation problem, a new
representation of dynamical systems is proposed. The proposed representation is
based on Tree Adjoining Grammar, a formalism that was developed from linguistic
considerations. Using the proposed representation, the identification problem
can be interpreted as a multi-objective optimization problem and we propose an
Evolutionary Algorithm-based approach to solve the problem. A benchmark example
is used to demonstrate the proposed approach. The results were found to be
comparable to those obtained by state-of-the-art non-linear system
identification methods, without making use of knowledge of the system
description. Comment: Submitted to European Control Conference (ECC) 201
Generating All Permutations by Context-Free Grammars in Greibach Normal Form
We consider context-free grammars $G_n$ in Greibach normal form and, particularly, in Greibach $m$-form ($m = 1, 2$) which generate the finite language $L_n$ of all $n!$ strings that are permutations of $n$ different symbols ($n \geq 1$). These grammars are investigated with respect to their descriptional complexity, i.e., we determine the number of nonterminal symbols and the number of production rules of $G_n$ as functions of $n$. As in the case of Chomsky normal form, these descriptional complexity measures grow faster than any polynomial function.