Finite Automata for the Sub- and Superword Closure of CFLs: Descriptional and Computational Complexity

Author: A Okhotin
B Courcelle
C Brabrand
H Gruber
J Esparza
J Leeuwen van
M Mohri
MF Atig
N Rampersad
N Vasudevan
P Ganty
P Habermehl
R Axelsson
S Schmitz
Y Bar-Hillel
Z Long
Publication date: 23/10/2014

We answer two open questions by (Gruber, Holzer, Kutrib, 2009) on the state complexity of representing sub- or superword closures of context-free grammars (CFGs): (1) We prove a (tight) upper bound of $2^{\mathcal{O}(n)}$ on the size of nondeterministic finite automata (NFAs) representing the subword closure of a CFG of size $n$. (2) We present a family of CFGs for which the minimal deterministic finite automata representing their subword closure match the upper bound of $2^{2^{\mathcal{O}(n)}}$ following from (1). Furthermore, we prove that the inequivalence problem for NFAs representing sub- or superword-closed languages is only NP-complete, as opposed to PSPACE-complete for general NFAs. Finally, we extend our results into an approximation method to attack inequivalence problems for CFGs.
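As a rough illustration of what a subword closure is, the sketch below handles only the much simpler regular case, not the CFG construction of the paper. It assumes that "subword" means scattered subword (letters may be deleted anywhere), and all names (`subword_closure`, `eps_closure`, `accepts`) are hypothetical. For an NFA, the closure can be obtained by adding an epsilon move alongside every transition, so the number of states does not grow; the abstract's exponential bounds concern the harder case where the input is a grammar.

```python
# Illustrative sketch only: subword (downward) closure of an NFA.
# Assumption: "subword" means scattered subword; None stands for epsilon.

def subword_closure(transitions):
    """Add an epsilon copy of every transition so any letter may be skipped."""
    return set(transitions) | {(p, None, q) for (p, _, q) in transitions}

def eps_closure(states, transitions):
    """All states reachable from `states` using epsilon moves only."""
    seen = set(states)
    stack = list(states)
    while stack:
        p = stack.pop()
        for (s, a, q) in transitions:
            if s == p and a is None and q not in seen:
                seen.add(q)
                stack.append(q)
    return seen

def accepts(transitions, start, finals, word):
    """Standard on-the-fly NFA simulation with epsilon moves."""
    current = eps_closure({start}, transitions)
    for ch in word:
        moved = {q for (s, a, q) in transitions if s in current and a == ch}
        current = eps_closure(moved, transitions)
    return bool(current & set(finals))

# NFA accepting exactly the word "abc": 0 -a-> 1 -b-> 2 -c-> 3
word_nfa = {(0, "a", 1), (1, "b", 2), (2, "c", 3)}
closed = subword_closure(word_nfa)

for w in ["abc", "ac", "", "ca"]:
    print(repr(w), accepts(word_nfa, 0, {3}, w), accepts(closed, 0, {3}, w))
```

The closed automaton accepts every word obtained from "abc" by deleting letters, and rejects words such as "ca" whose letters appear in the wrong order.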

arXiv.org e-Print Archive

CiteSeerX

Crossref

Criticality in Formal Languages and Statistical Physics

Author: Lin Henry W.
Tegmark Max
Publication venue: 'MDPI AG'
Publication date: 23/06/2017

We show that the mutual information between two symbols, as a function of the number of symbols between the two, decays exponentially in any probabilistic regular grammar, but can decay like a power law for a context-free grammar. This result about formal languages is closely related to a well-known result in classical statistical mechanics that there are no phase transitions in dimensions fewer than two. It is also related to the emergence of power-law correlations in turbulence and cosmological inflation through recursive generative processes. We elucidate these physics connections and comment on potential applications of our results to machine learning tasks like training artificial recurrent neural networks. Along the way, we introduce a useful quantity which we dub the rational mutual information and discuss generalizations of our claims involving more complicated Bayesian networks.
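The exponential decay in the regular case can be checked numerically on a toy model. The sketch below is not taken from the paper: it assumes a two-state symmetric Markov chain (a very simple stand-in for a probabilistic regular grammar) with a hypothetical flip probability of 0.2, and the function names are illustrative. It computes the mutual information between symbols separated by a given distance directly from powers of the transition matrix.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mutual_information(transition, stationary, distance):
    """Mutual information (in bits) between X_t and X_{t+distance}
    for a stationary Markov chain."""
    n = len(transition)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(distance):
        power = mat_mul(power, transition)
    info = 0.0
    for a in range(n):
        for b in range(n):
            joint = stationary[a] * power[a][b]
            if joint > 0.0:
                # joint / (pi_a * pi_b) simplifies to power[a][b] / pi_b
                info += joint * math.log2(power[a][b] / stationary[b])
    return info

# Assumed toy chain: stay with probability 0.8, flip with probability 0.2.
T = [[0.8, 0.2], [0.2, 0.8]]
PI = [0.5, 0.5]  # stationary distribution of the symmetric chain

for d in range(1, 12):
    print(d, mutual_information(T, PI, d))
```

For this chain the second eigenvalue of the transition matrix is 0.6, so the ratio of successive values should approach 0.6 squared, a constant factor per step, which is what exponential decay means here. A power-law decay, as claimed for context-free grammars, would instead show ratios tending to 1.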

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Synthesizing Program Input Grammars

Author: Albarghouthi A.
Cadar C.
Cho C. Y.
Forrester J. E.
Godefroid P.
Holler C.
Huang L.
Lee L.
Oncina J.
Solomonoff R. J.
Sutton M.
Sutton M.
Vardhan A.
Viide J.
Wondracek G.
Publication date: 16/06/2017

We present an algorithm for synthesizing a context-free grammar encoding the language of valid program inputs from a set of input examples and blackbox access to the program. Our algorithm addresses shortcomings of existing grammar inference algorithms, which both severely overgeneralize and are prohibitively slow. Our implementation, GLADE, leverages the grammar synthesized by our algorithm to fuzz test programs with structured inputs. We show that GLADE substantially increases the incremental coverage on valid inputs compared to two baseline fuzzers.

arXiv.org e-Print Archive

Crossref

Controlled non uniform random generation of decomposable structures

Author: A. Denise
Berghen
Bertoni
Bostan
Brlek
Denise
Denise
Dershowitz
Drmota
Duchon
Dutour
Faugère
Flajolet
Flajolet
Flajolet
Flajolet
Fontana
Goldwurm
Greene
Hofacker
Hofacker
Jin
Lipshitz
M. Termier
Mathews
Mathews
Nebel
Nebel
Nicodème
Nijenhuis
Ponty
Salvy
Schönhage
van der Hoeven
Vauchaussade de Chaumont
Waterman
Y. Ponty
Publication venue: 'Elsevier BV'
Publication date: 01/01/2010

Consider a class of decomposable combinatorial structures, using different types of atoms $\Atoms = \{\At_1,\ldots ,\At_{|\Atoms|}\}$. We address the random generation of such structures with respect to a size $n$ and a targeted distribution in $k$ of its distinguished atoms. We consider two variations on this problem. In the first alternative, the targeted distribution is given by $k$ real numbers $\TargFreq_1, \ldots, \TargFreq_k$ such that $0 < \TargFreq_i < 1$ for all $i$ and $\TargFreq_1+\cdots+\TargFreq_k \leq 1$. We aim to generate random structures among the whole set of structures of a given size $n$, in such a way that the expected frequency of any distinguished atom $\At_i$ equals $\TargFreq_i$. We address this problem by weighting the atoms with a $k$-tuple $\Weights$ of real-valued weights, inducing a weighted distribution over the set of structures of size $n$. We first adapt the classical recursive random generation scheme into an algorithm taking $\mathcal{O}(n^{1+o(1)}+mn\log{n})$ arithmetic operations to draw $m$ structures from the $\Weights$-weighted distribution. Secondly, we address the analytical computation of weights such that the targeted frequencies are achieved asymptotically, i.e. for large values of $n$. We derive systems of functional equations whose resolution gives an explicit relationship between $\Weights$ and $\TargFreq_1, \ldots, \TargFreq_k$. Lastly, we give an algorithm in $\mathcal{O}(k n^4)$ for the inverse problem, i.e. computing the frequencies associated with a given $k$-tuple $\Weights$ of weights, and an optimized version in $\mathcal{O}(k n^2)$ in the case of context-free languages. This allows for a heuristic resolution of the weights/frequencies relationship suitable for complex specifications. In the second alternative, the targeted distribution is given by a