Calibrating Generative Models: The Probabilistic Chomsky-Schützenberger Hierarchy
A probabilistic Chomsky–Schützenberger hierarchy of grammars is introduced and studied, with the aim of understanding the expressive power of generative models. We offer characterizations of the distributions definable at each level of the hierarchy, including probabilistic regular, context-free, (linear) indexed, context-sensitive, and unrestricted grammars, each corresponding to familiar probabilistic machine classes. Special attention is given to distributions on (unary notations for) positive integers. Unlike in the classical case, where the "semi-linear" languages all collapse into the regular languages, we show, using analytic tools adapted from the classical setting, that there is no collapse in the probabilistic hierarchy: more distributions become definable at each level. We also address related issues such as closure under probabilistic conditioning.
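As a rough illustration of what a distribution on unary notations for positive integers can look like at the lowest level, the sketch below uses a hypothetical probabilistic regular grammar, S -> a S with probability p and S -> a with probability 1 - p. This grammar is an assumption chosen for illustration and is not taken from the paper; it derives the word a^n with probability p^(n-1)(1 - p), i.e. a geometric distribution. The names `sample_unary` and `probability` are likewise illustrative.

```python
import random


def sample_unary(p_continue, rng):
    """Sample a word a^n from the (hypothetical) grammar
    S -> a S  with probability p_continue
    S -> a    with probability 1 - p_continue
    """
    word = "a"  # either rule emits one 'a'
    while rng.random() < p_continue:
        word += "a"  # the recursive rule was chosen, so derive another 'a'
    return word


def probability(n, p_continue):
    """Exact probability that the grammar derives a^n, for n >= 1."""
    return p_continue ** (n - 1) * (1 - p_continue)


if __name__ == "__main__":
    rng = random.Random(0)
    print([len(sample_unary(0.5, rng)) for _ in range(10)])
    print([probability(n, 0.5) for n in range(1, 5)])
```

The word lengths produced by `sample_unary` are the sampled positive integers, and `probability` gives the corresponding exact distribution.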
Parametrized Stochastic Grammars for RNA Secondary Structure Prediction
We propose a two-level stochastic context-free grammar (SCFG) architecture
for parametrized stochastic modeling of a family of RNA sequences, including
their secondary structure. A stochastic model of this type can be used for
maximum a posteriori estimation of the secondary structure of any new sequence
in the family. The proposed SCFG architecture models RNA subsequences
comprising paired bases as stochastically weighted Dyck-language words, i.e.,
as weighted balanced-parenthesis expressions. The length of each run of
unpaired bases, forming a loop or a bulge, is taken to have a phase-type
distribution: that of the hitting time in a finite-state Markov chain. Without
loss of generality, each such Markov chain can be taken to have a bounded
complexity. The scheme yields an overall family SCFG with a manageable number
of parameters.
Comment: 5 pages, submitted to the 2007 Information Theory and Applications
Workshop (ITA 2007
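To make the phase-type idea concrete, here is a minimal sketch of a run length modeled as the hitting time of an absorbing state in a finite-state Markov chain. The three-state transition matrix, the choice of absorbing state, and the function names are all hypothetical, chosen only for illustration; they are not parameters from the paper.

```python
import random

# Hypothetical chain: states 0 and 1 are transient, state 2 is absorbing.
TRANSITIONS = [
    [0.5, 0.3, 0.2],
    [0.0, 0.6, 0.4],
    [0.0, 0.0, 1.0],
]
ABSORBING = 2


def sample_hitting_time(rng, start=0):
    """Simulate the chain from `start` and return the number of steps
    taken until the absorbing state is first reached."""
    state, steps = start, 0
    while state != ABSORBING:
        state = rng.choices(range(len(TRANSITIONS)), weights=TRANSITIONS[state])[0]
        steps += 1
    return steps


def hitting_time_pmf(max_steps, start=0):
    """Return a list whose entry k-1 is P(hitting time == k), for k = 1..max_steps."""
    n = len(TRANSITIONS)
    dist = [0.0] * n  # probability mass still on each transient state
    dist[start] = 1.0
    pmf = []
    for _ in range(max_steps):
        new = [0.0] * n
        for i, mass in enumerate(dist):
            if i == ABSORBING:
                continue
            for j, p in enumerate(TRANSITIONS[i]):
                new[j] += mass * p
        pmf.append(new[ABSORBING])  # mass absorbed exactly at this step
        new[ABSORBING] = 0.0        # absorbed mass is not propagated further
        dist = new
    return pmf


if __name__ == "__main__":
    print(hitting_time_pmf(5))
    print(sample_hitting_time(random.Random(0)))
```

In this reading, a sampled hitting time would play the role of the length of one run of unpaired bases, and the number of transient states controls how flexible the length distribution can be.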
On external presentations of infinite graphs
The vertices of a finite state system are usually a subset of the natural
numbers. Most algorithms relative to these systems only use this fact to select
vertices.
For infinite state systems, however, the situation is different: in
particular, for such systems having a finite description, each state of the
system is a configuration of some machine. Then most algorithmic approaches
rely on the structure of these configurations. Such characterisations are said
to be internal. In order to apply algorithms detecting a structural property (like
identifying connected components) one may first have to transform the system in
order to fit the description needed for the algorithm. The problem with internal
characterisation is that it hides structural properties, and each solution
becomes ad hoc relative to the form of the configurations.
On the contrary, external characterisations avoid explicit naming of the
vertices. Such characterisations are mostly defined via graph transformations.
In this paper we present two kinds of external characterisations:
deterministic graph rewriting, which in turn characterises regular graphs,
deterministic context-free languages, and rational graphs. Inverse substitution
from a generator (like the complete binary tree) provides characterisations for
prefix-recognizable graphs, the Caucal Hierarchy and rational graphs. We
illustrate how these characterisations provide an efficient tool for the
representation of infinite state systems.
A short essay on the interplay between algebraic language theory, Galois theory and class field theory : comparing physics and theory of computation (Mathematical aspects of quantum fields and related topics)
This paper is written as a technical report for our talk given at the RJMS workshop on quantum fields and related topics, held on 6th-8th December 2021. In this talk we introduced our recent works [23, 24, 25, 26] in formal language theory to the mathematical physics community; these works concern the interplay between algebraic language theory, Galois theory and class field theory. In this paper we discuss some of their conceptual content in more detail.
Complexity of Two-Dimensional Patterns
In dynamical systems such as cellular automata and iterated maps, it is often
useful to look at a language or set of symbol sequences produced by the system.
There are well-established classification schemes, such as the Chomsky
hierarchy, with which we can measure the complexity of these sets of sequences,
and thus the complexity of the systems which produce them.
In this paper, we look at the first few levels of a hierarchy of complexity
for two-or-more-dimensional patterns. We show that several definitions of
"regular language" or "local rule" that are equivalent in d=1 lead to
distinct classes in d >= 2. We explore the closure properties and computational
complexity of these classes, including undecidability and L-, NL- and
NP-completeness results.
We apply these classes to cellular automata, in particular to their sets of
fixed and periodic points, finite-time images, and limit sets. We show that it
is undecidable whether a CA in d >= 2 has a periodic point of a given period,
and that certain "local lattice languages" are not finite-time images or
limit sets of any CA. We also show that the entropy of a d-dimensional CA's
finite-time image cannot decrease faster than t^{-d} unless it maps every
initial condition to a single homogeneous state.
Comment: To appear in J. Stat. Phy