Rational stochastic languages
The goal of the present paper is to provide a systematic and comprehensive
study of rational stochastic languages over a semiring K \in {Q, Q+, R, R+}. A
rational stochastic language is a probability distribution over a free monoid
\Sigma^* which is rational over K, that is which can be generated by a
multiplicity automaton with parameters in K. We study the relations between the
classes of rational stochastic languages S^{rat}_K(\Sigma). We define the notion
of residual of a stochastic language and we use it to investigate properties of
several subclasses of rational stochastic languages. Lastly, we study the
representation of rational stochastic languages by means of multiplicity
automata.
Comment: 35 pages
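To make the model concrete: a multiplicity automaton over K is a linear representation (an initial vector, one transition matrix per letter, and a final vector), and the weight it assigns to a word is the corresponding matrix product. Below is a minimal sketch of this semantics; the one-state example automaton and all names are illustrative, not taken from the paper.

```python
import numpy as np

# A multiplicity automaton as a linear representation: an initial vector
# iota, one transition matrix per letter, and a final vector tau. The
# weight of a word w = w_1...w_n is iota^T M_{w_1} ... M_{w_n} tau.
def word_weight(iota, matrices, tau, word):
    v = iota
    for letter in word:
        v = v @ matrices[letter]  # accumulate the matrix product left to right
    return float(v @ tau)

# Illustrative one-state automaton: P(a^n) = (1 - p) * p^n, a stochastic
# language since these weights sum to 1 over {'a'}^*.
p = 0.5
iota = np.array([1.0])
matrices = {"a": np.array([[p]])}
tau = np.array([1.0 - p])

print(word_weight(iota, matrices, tau, "aaa"))  # (1 - p) * p**3 = 0.0625
```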
Learning rational stochastic languages
Given a finite set of words w_1, ..., w_n drawn independently according to a fixed unknown distribution P called a stochastic language, a usual goal in Grammatical Inference is to infer an estimate of P in some class of probabilistic models, such as Probabilistic Automata (PA). Here, we study the class of rational stochastic languages, which consists of the stochastic languages that can be generated by Multiplicity Automata (MA) and which strictly includes the class of stochastic languages generated by PA. Rational stochastic languages have a minimal normal representation which may be very concise, and whose parameters can be efficiently estimated from stochastic samples. We design an efficient inference algorithm, DEES, which aims at building a minimal normal representation of the target. Despite the fact that no recursively enumerable class of MA computes exactly the set of rational stochastic languages over Q, we show that DEES strongly identifies this set in the limit. We study the intermediate MA output by DEES and show that they compute rational series which converge absolutely to one and which can be used to provide stochastic languages that closely estimate the target.
Comment: 15 pages
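The abstract does not spell out DEES, so the following is only a hedged sketch of the residual-based test underlying this family of learners, assuming empirical residuals estimated from the sample and a least-squares span test; the function names, tolerance, and choice of test suffixes are all illustrative.

```python
import numpy as np

# Hedged sketch, not the paper's pseudocode: learners in this family grow a
# basis of prefixes and, for each candidate prefix, test whether its
# empirical residual lies (approximately) in the span of the basis residuals.
def residual(sample, prefix, suffixes):
    """Empirical residual of `prefix`: P(prefix + s) / P(prefix Sigma^*)."""
    n = sum(1 for w in sample if w.startswith(prefix))
    if n == 0:
        return None
    return np.array([sum(1 for w in sample if w == prefix + s) / n
                     for s in suffixes])

def in_span(vec, basis, tol=0.1):
    """Least-squares test: is `vec` close to a combination of `basis`?"""
    if not basis:
        return False
    B = np.stack(basis, axis=1)
    coeffs, *_ = np.linalg.lstsq(B, vec, rcond=None)
    return np.linalg.norm(B @ coeffs - vec) <= tol
```

At a high level, the basis starts from the empty prefix; a candidate prefix whose empirical residual falls outside the span becomes a new state, and otherwise the least-squares coefficients supply transition weights.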
Relevant Representations for the Inference of Rational Stochastic Tree Languages
Recently, an algorithm, DEES, was proposed for learning rational stochastic tree languages. Given an independently and identically distributed sample of trees drawn according to a rational stochastic language, DEES outputs a linear representation of a rational series which converges to the target. DEES can then be used to identify rational stochastic tree languages in the limit with probability one. However, when DEES deals with finite samples, it often outputs a rational tree series which does not define a stochastic language. Moreover, the linear representation cannot be directly used as a generative model. In this paper, we show that any representation of a rational stochastic tree language can be transformed into a reduced normalised representation that can be used to generate trees from the underlying distribution. We also study some consistency properties of rational stochastic tree languages and discuss their implications for inference. We finally consider the applicability of DEES to trees built over an unranked alphabet.
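To illustrate what a normalisation of this kind can look like, here is a hedged sketch on strings rather than trees (trees add arity bookkeeping that the paper handles): conjugating a linear representation by the diagonal matrix of per-state total weights makes each state's stopping weight plus outgoing letter weights sum to 1, so it can be sampled from directly when all parameters are nonnegative. The assumptions (convergent series, positive per-state totals) and all names are the sketch's, not the paper's.

```python
import numpy as np

# Hedged sketch on strings rather than trees: conjugate a linear
# representation (iota, {M_a}, tau) by the diagonal of per-state total
# weights s = (I - sum_a M_a)^{-1} tau, so that each state's stopping
# weight plus outgoing letter weights sums to 1. Assumes the series
# converges and every entry of s is positive (a reduced representation).
def normalize(iota, matrices, tau):
    M = sum(matrices.values())
    n = len(tau)
    s = np.linalg.solve(np.eye(n) - M, tau)  # total weight emitted from each state
    D, Dinv = np.diag(s), np.diag(1.0 / s)
    iota_n = (iota * s) / (iota @ s)         # sums to 1: an initial distribution
    mats_n = {a: Dinv @ Ma @ D for a, Ma in matrices.items()}
    tau_n = Dinv @ tau                       # per-state stopping probability
    return iota_n, mats_n, tau_n
```

When some parameters are negative, the normalised representation still computes the same series but cannot be read off state by state as a generator, which is where the paper's construction for trees does more work.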
Parametrized Stochastic Grammars for RNA Secondary Structure Prediction
We propose a two-level stochastic context-free grammar (SCFG) architecture
for parametrized stochastic modeling of a family of RNA sequences, including
their secondary structure. A stochastic model of this type can be used for
maximum a posteriori estimation of the secondary structure of any new sequence
in the family. The proposed SCFG architecture models RNA subsequences
comprising paired bases as stochastically weighted Dyck-language words, i.e.,
as weighted balanced-parenthesis expressions. The length of each run of
unpaired bases, forming a loop or a bulge, is taken to have a phase-type
distribution: that of the hitting time in a finite-state Markov chain. Without
loss of generality, each such Markov chain can be taken to have a bounded
complexity. The scheme yields an overall family SCFG with a manageable number
of parameters.
Comment: 5 pages, submitted to the 2007 Information Theory and Applications Workshop (ITA 2007)
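The loop-length model is concrete enough to sample: a phase-type distribution is, by definition, the law of the hitting time of an absorbing state in a finite Markov chain. The sketch below assumes only that definition; the example chain and its parameters are illustrative, not the paper's.

```python
import numpy as np

# Hedged sketch of the loop/bulge length model: a phase-type distribution
# is the hitting time of an absorbing state in a finite Markov chain.
def sample_phase_type(start_probs, P, absorbing, rng):
    state = rng.choice(len(start_probs), p=start_probs)
    steps = 0
    while state != absorbing:
        state = rng.choice(P.shape[1], p=P[state])  # one Markov step
        steps += 1
    return steps  # length of the run of unpaired bases

rng = np.random.default_rng(0)
P = np.array([[0.5, 0.3, 0.2],   # two transient states ...
              [0.0, 0.6, 0.4],
              [0.0, 0.0, 1.0]])  # ... feeding absorbing state 2
print([sample_phase_type([1.0, 0.0, 0.0], P, 2, rng) for _ in range(5)])
```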
Calibrating Generative Models: The Probabilistic Chomsky-Schützenberger Hierarchy
A probabilistic Chomsky–Schützenberger hierarchy of grammars is introduced and studied, with the aim of understanding the expressive power of generative models. We offer characterizations of the distributions definable at each level of the hierarchy, including probabilistic regular, context-free, (linear) indexed, context-sensitive, and unrestricted grammars, each corresponding to a familiar probabilistic machine class. Special attention is given to distributions on (unary notations for) positive integers. Unlike in the classical case, where the "semi-linear" languages all collapse into the regular languages, we use analytic tools adapted from the classical setting to show that there is no collapse in the probabilistic hierarchy: more distributions become definable at each level. We also address related issues such as closure under probabilistic conditioning.
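As a hedged illustration of the unary case (not an example from the paper): over a one-letter alphabet, a probabilistic regular grammar yields geometric-type length distributions, while already the context-free rule S -> S S | a produces the total-progeny law of a subcritical branching process, whose tail carries an extra polynomial factor on top of the geometric one. The sampler below makes that context-free distribution concrete; the parameter and the budget guard are illustrative.

```python
import random

# Hedged illustration, not from the paper: lengths generated by the
# context-free rule S -> S S (prob p) | a (prob 1 - p) follow the
# total-progeny law of a subcritical branching process when p < 1/2.
def sample_cf_length(p=0.4, budget=10_000, rng=random.Random(0)):
    pending, length = 1, 0          # pending S symbols, emitted a's
    while pending:
        pending -= 1
        if rng.random() < p:
            pending += 2            # S -> S S
        else:
            length += 1             # S -> a
        if length + pending > budget:
            return None             # guard against rare huge derivations
    return length

print([sample_cf_length() for _ in range(10)])
```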
Computable de Finetti measures
We prove a computable version of de Finetti's theorem on exchangeable
sequences of real random variables. As a consequence, exchangeable stochastic
processes expressed in probabilistic functional programming languages can be
automatically rewritten as procedures that do not modify non-local state. Along
the way, we prove that a distribution on the unit interval is computable if and
only if its moments are uniformly computable.
Comment: 32 pages. Final journal version; expanded somewhat, with minor corrections. To appear in Annals of Pure and Applied Logic. Extended abstract appeared in Proceedings of CiE '09, LNCS 5635, pp. 218-23
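A classical example makes the programming-language consequence concrete: a Pólya urn samples an exchangeable binary sequence by mutating shared counts, while its de Finetti form draws a latent bias once and then samples i.i.d.; for an urn starting with one red and one black ball, the de Finetti measure is uniform on [0, 1]. The sketch below is the editor's illustration, not code from the paper.

```python
import random

# A Polya urn samples an exchangeable sequence by mutating shared counts,
# while its de Finetti form draws a latent bias once, then samples i.i.d.
# For an urn starting with one red and one black ball, the de Finetti
# measure is uniform on [0, 1] (a classical fact).
def polya_urn(n, rng):
    red, black = 1, 1                          # mutable, non-local-style state
    draws = []
    for _ in range(n):
        is_red = rng.random() < red / (red + black)
        red, black = red + is_red, black + (not is_red)
        draws.append(int(is_red))
    return draws

def de_finetti_form(n, rng):
    theta = rng.random()                       # latent bias: Uniform[0, 1]
    return [int(rng.random() < theta) for _ in range(n)]  # i.i.d. given theta

rng = random.Random(0)
print(polya_urn(10, rng), de_finetti_form(10, rng))
```

Both procedures induce the same distribution over sequences, but only the second is free of non-local state.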
Criticality in Formal Languages and Statistical Physics
We show that the mutual information between two symbols, as a function of the
number of symbols between the two, decays exponentially in any probabilistic
regular grammar, but can decay like a power law for a context-free grammar.
This result about formal languages is closely related to a well-known result in
classical statistical mechanics that there are no phase transitions in
fewer than two dimensions. It is also related to the emergence of power-law
correlations in turbulence and cosmological inflation through recursive
generative processes. We elucidate these physics connections and comment on
potential applications of our results to machine learning tasks like training
artificial recurrent neural networks. Along the way, we introduce a useful
quantity which we dub the rational mutual information and discuss
generalizations of our claims involving more complicated Bayesian networks.
Comment: Replaced to match final published version. Discussion improved, references added
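The regular-grammar half of the claim can be checked directly in its simplest instance, a stationary two-state Markov chain, where the mutual information between symbols d steps apart decays geometrically, governed by the chain's second eigenvalue. The sketch below is illustrative; the transition matrix is arbitrary.

```python
import numpy as np

# Hedged check of the regular-grammar claim in its simplest case: for a
# stationary two-state Markov chain, the mutual information between symbols
# d steps apart decays geometrically, governed by the second eigenvalue.
def mutual_information(P, d):
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])  # stationary distribution
    pi /= pi.sum()
    joint = pi[:, None] * np.linalg.matrix_power(P, d)  # P(X_0 = i, X_d = j)
    indep = np.outer(pi, pi)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / indep[mask])))

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
for d in (1, 2, 4, 8, 16):
    print(d, mutual_information(P, d))  # shrinks geometrically in d
```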