
HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Archivio istituzionale della ricerca - Università di Palermo

Sparse approaches for the exact distribution of patterns in long state sequences generated by a Markov source

Author: Aho
Allauzen
Antzoulakos
Beaudoing
Boeva
Boeva
Brazma
Chang
Cormen
Cowan
Crochemore
Crochemore
Denise
El~Karoui
Erhardsson
Fiduccia
Frith
Fu
Geske
Godbole
Gregory Nuel
Hampson
Hopcroft
Hopcroft
Jean-Guillaume Dumas
Kaltofen
Karlin
Kleffe
Knuth
Le~Maout
Lladser
Mariño-Ramírez
Nicodème
Nuel
Nuel
Nuel
Nuel
Nuel
Nuel
Nuel
Pevzner
Prum
Reinert
Ribeca
Régnier
Stefanov
Stefanov
Storjohann
van Helden
Publication venue: 'Elsevier BV'
Publication date: 01/01/2012

We present two novel approaches for the computation of the exact distribution of a pattern in a long sequence. Both approaches take into account the sparse structure of the problem and are two-part algorithms. The first approach relies on a partial recursion after a fast computation of the second largest eigenvalue of the transition matrix of a Markov chain embedding. The second approach uses fast Taylor expansions of an exact bivariate rational reconstruction of the distribution. We illustrate the interest of both approaches on a simple toy example and two biological applications: the transcription factors of the Human Chromosome 5 and the PROSITE signatures of functional motifs in proteins. On these examples our methods demonstrate their complementarity and their ability to extend the domain of feasibility for exact computations in pattern problems to a new level.
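To make the idea of a Markov chain embedding concrete, here is a minimal Python sketch. It is not either of the sparse algorithms described in the abstract: it is the plain dense recursion over the embedding, and it assumes a single pattern word, overlapping occurrences, and an order-1 Markov source. The names `pattern_dfa` and `count_distribution` are hypothetical.

```python
from collections import defaultdict


def pattern_dfa(pattern, alphabet):
    """Build delta[state][letter] for the automaton that tracks the longest
    suffix of the text read so far that is a prefix of `pattern`."""
    m = len(pattern)
    delta = []
    for q in range(m + 1):
        row = {}
        for a in alphabet:
            w = pattern[:q] + a
            k = min(len(w), m)
            # Shrink k until pattern[:k] is a suffix of w (k = 0 always works).
            while not w.endswith(pattern[:k]):
                k -= 1
            row[a] = k
        delta.append(row)
    return delta


def count_distribution(pattern, n, init, trans):
    """Exact distribution of the number of (overlapping) occurrences of
    `pattern` in a length-n sequence drawn from an order-1 Markov source.

    init[a] is the probability of the first letter a,
    trans[a][b] is the probability of letter b following letter a.
    """
    alphabet = sorted(init)
    delta = pattern_dfa(pattern, alphabet)
    m = len(pattern)
    # Embedded chain state: (automaton state, last letter, occurrences so far).
    dist = defaultdict(float)
    for a in alphabet:
        q = delta[0][a]
        dist[(q, a, int(q == m))] += init[a]
    for _ in range(n - 1):
        nxt = defaultdict(float)
        for (q, a, c), p in dist.items():
            for b in alphabet:
                q2 = delta[q][b]
                nxt[(q2, b, c + int(q2 == m))] += p * trans[a][b]
        dist = nxt
    # Marginalise over automaton state and last letter.
    out = defaultdict(float)
    for (_, _, c), p in dist.items():
        out[c] += p
    return dict(out)
```

For a uniform independent source over `{a, b}`, the pattern `aa`, and `n = 3`, the eight equally likely sequences give counts 0, 1, 2 with probabilities 5/8, 2/8, 1/8, which this recursion reproduces. The cost grows with `n` times the number of reachable embedded states, which is the regime the sparse approaches in the abstract are meant to improve on.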

Hal - Université Grenoble Alpes

HAL Descartes

Comparator automata in quantitative verification

Author: Bansal Suguman
Chaudhuri Swarat
Vardi Moshe Y.
Publication date: 16/12/2018

The notion of comparison between system runs is fundamental in formal verification. This concept is implicitly present in the verification of qualitative systems, and is more pronounced in the verification of quantitative systems. In this work, we identify a novel mode of comparison in quantitative systems: the online comparison of the aggregate values of two sequences of quantitative weights. This notion is embodied by comparator automata (comparators, in short), a new class of automata that read two infinite sequences of weights synchronously and relate their aggregate values. We show that aggregate functions that can be represented with a Büchi automaton result in comparators that are finite-state and accept by the Büchi condition as well. Such $\omega$-regular comparators further lead to generic algorithms for a number of well-studied problems, including the quantitative inclusion and winning strategies in quantitative graph games with incomplete information, as well as related non-decision problems, such as obtaining a finite representation of all counterexamples in the quantitative inclusion problem. We study comparators for two aggregate functions: discounted-sum and limit-average. We prove that the discounted-sum comparator is $\omega$-regular iff the discount factor is an integer. Not every aggregate function, however, has an $\omega$-regular comparator. Specifically, we show that the language of sequence-pairs for which limit-average aggregates exist is neither $\omega$-regular nor $\omega$-context-free. Given this result, we introduce the notion of prefix-average as a relaxation of limit-average aggregation, and show that it admits $\omega$-context-free comparators.
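As a small illustration of what a discounted-sum comparison relates, the following Python sketch computes the exact discounted sum of an ultimately periodic weight sequence and compares two such sequences. It is an offline closed-form computation, not a comparator automaton, and it assumes the convention that the discounted sum of $a_0 a_1 a_2 \ldots$ with discount factor $d > 1$ is $\sum_i a_i / d^i$. The function names and the `(prefix, cycle)` encoding are hypothetical.

```python
from fractions import Fraction


def discounted_sum(prefix, cycle, d):
    """Exact discounted sum of the infinite sequence prefix . cycle^omega,
    i.e. sum_i a_i / d**i, for a discount factor d > 1."""
    d = Fraction(d)
    if d <= 1:
        raise ValueError("discount factor must be greater than 1")
    if not cycle:
        raise ValueError("cycle must be non-empty")
    head = sum(Fraction(w) / d**i for i, w in enumerate(prefix))
    one_cycle = sum(Fraction(w) / d**j for j, w in enumerate(cycle))
    # Geometric series over repetitions of the cycle, shifted past the prefix.
    tail = one_cycle / (1 - 1 / d ** len(cycle)) / d ** len(prefix)
    return head + tail


def compare(seq_a, seq_b, d):
    """Return -1, 0 or 1 according to whether the discounted sum of seq_a is
    smaller than, equal to, or larger than that of seq_b.
    Each sequence is a (prefix, cycle) pair."""
    a = discounted_sum(seq_a[0], seq_a[1], d)
    b = discounted_sum(seq_b[0], seq_b[1], d)
    return (a > b) - (a < b)
```

With `d = 2`, the constant sequence `1 1 1 ...` and the sequence `2 0 0 ...` both have discounted sum 2, so `compare(([], [1]), ([2], [0]), 2)` returns 0.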

Episciences.org

Incomplete Transition Complexity of Basic Operations on Finite Languages

Author: C. Câmpeanu
E. Maia
K. Salomaa
K.R. Beesley
S. Owens
Y. Gao
Y.S. Han
Publication date: 01/01/2013

The state complexity of basic operations on finite languages (considering complete DFAs) has been studied in the literature. In this paper we study the incomplete (deterministic) state and transition complexity on finite languages of boolean operations, concatenation, star, and reversal. For all operations we give tight upper bounds for both description measures. We correct the published state complexity of concatenation for complete DFAs and provide a tight upper bound for the case when the right automaton is larger than the left one. For all binary operations the tightness is proved using families of languages with a variable alphabet size. In general the operational complexities depend not only on the complexities of the operands but also on other refined measures.

Comment: 13 pages

Controlled non uniform random generation of decomposable structures

Author: A. Denise
Berghen
Bertoni
Bostan
Brlek
Denise
Denise
Dershowitz
Drmota
Duchon
Dutour
Faugère
Flajolet
Flajolet
Flajolet
Flajolet
Fontana
Goldwurm
Greene
Hofacker
Hofacker
Jin
Lipshitz
M. Termier
Mathews
Mathews
Nebel
Nebel
Nicodème
Nijenhuis
Ponty
Salvy
Schönhage
van der Hoeven
Vauchaussade de Chaumont
Waterman
Y. Ponty
Publication venue: 'Elsevier BV'
Publication date: 01/01/2010

Consider a class of decomposable combinatorial structures, using different types of atoms $\Atoms = \{\At_1,\ldots,\At_{|\Atoms|}\}$. We address the random generation of such structures with respect to a size $n$ and a targeted distribution in $k$ of its distinguished atoms. We consider two variations on this problem. In the first alternative, the targeted distribution is given by $k$ real numbers $\TargFreq_1, \ldots, \TargFreq_k$ such that $0 < \TargFreq_i < 1$ for all $i$ and $\TargFreq_1+\cdots+\TargFreq_k \leq 1$. We aim to generate random structures among the whole set of structures of a given size $n$, in such a way that the expected frequency of any distinguished atom $\At_i$ equals $\TargFreq_i$. We address this problem by weighting the atoms with a $k$-tuple $\Weights$ of real-valued weights, inducing a weighted distribution over the set of structures of size $n$. We first adapt the classical recursive random generation scheme into an algorithm taking $\mathcal{O}(n^{1+o(1)}+mn\log{n})$ arithmetic operations to draw $m$ structures from the $\Weights$-weighted distribution. Secondly, we address the analytical computation of weights such that the targeted frequencies are achieved asymptotically, i.e. for large values of $n$. We derive systems of functional equations whose resolution gives an explicit relationship between $\Weights$ and $\TargFreq_1, \ldots, \TargFreq_k$. Lastly, we give an algorithm in $\mathcal{O}(k n^4)$ for the inverse problem, i.e. computing the frequencies associated with a given $k$-tuple $\Weights$ of weights, and an optimized version in $\mathcal{O}(k n^2)$ in the case of context-free languages. This allows for a heuristic resolution of the weights/frequencies relationship suitable for complex specifications. In the second alternative, the targeted distribution is given by $k$ natural numbers $n_1, \ldots, n_k$ such that $n_1+\cdots+n_k+r=n$, where $r \geq 0$ is the number of undistinguished atoms. The structures must be generated uniformly among the set of structures of size $n$ that contain exactly $n_i$ atoms $\At_i$ ($1 \leq i \leq k$). We give an $\mathcal{O}(r^2\prod_{i=1}^k n_i^2 + m n k \log n)$ algorithm for generating $m$ structures, which simplifies into an $\mathcal{O}(r\prod_{i=1}^k n_i + m n)$ algorithm for regular specifications.
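The weighted recursive scheme mentioned in this abstract can be sketched on a toy specification. The Python example below is an illustration only, not the algorithm of the paper: it assumes the hypothetical specification `S -> empty | '.' S | '(' S ')' S` (Motzkin-like words), puts a single weight `w > 0` on the atom `.`, and uses the naive quadratic precomputation in floating point rather than the faster arithmetic the abstract refers to.

```python
import random


def weighted_counts(n, w):
    """W[s] = total weight of all structures of size s, where a structure's
    weight is w ** (number of '.' atoms) and brackets have weight 1."""
    W = [1.0] + [0.0] * n
    for size in range(1, n + 1):
        total = w * W[size - 1]                 # rule  S -> '.' S
        for k in range(size - 1):               # rule  S -> '(' S_k ')' S_rest
            total += W[k] * W[size - 2 - k]
        W[size] = total
    return W


def draw(n, W, w, rng):
    """Draw one structure of size n with probability proportional to its weight."""
    if n == 0:
        return ""
    u = rng.random() * W[n]
    u -= w * W[n - 1]
    if u < 0:
        return "." + draw(n - 1, W, w, rng)
    for k in range(n - 1):
        u -= W[k] * W[n - 2 - k]
        if u < 0:
            return "(" + draw(k, W, w, rng) + ")" + draw(n - 2 - k, W, w, rng)
    # Only reachable through floating-point rounding; still a valid structure.
    return "." + draw(n - 1, W, w, rng)


def dot_frequency(samples):
    """Observed frequency of the distinguished atom '.' over a list of samples."""
    total = sum(len(s) for s in samples)
    return sum(s.count(".") for s in samples) / total
```

With `w = 1` the table `W` contains the Motzkin numbers 1, 1, 2, 4, 9, 21, and the draw is uniform. Raising `w` shifts the expected frequency of `.` upward, which is the lever used to reach a targeted frequency; `dot_frequency` gives an empirical check on a batch of samples.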

Elsevier - Publisher Connector

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Polytechnique

HAL-Rennes 1

Mean-payoff Automaton Expressions

Author: A. Ehrenfeucht
E. Vidal
J. Desharnais
K. Chatterjee
K. Chatterjee
K. Chatterjee
L. Alfaro de
L. Alfaro de
M. Bojanczyk
M. Droste
M. Droste
M. Droste
M.O. Rabin
M.P. Schützenberger
O. Kupferman
R. Alur
U. Zwick
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Quantitative languages are an extension of boolean languages that assign to each word a real number. Mean-payoff automata are finite automata with numerical weights on transitions that assign to each infinite path the long-run average of the transition weights. When the mode of branching of the automaton is deterministic, nondeterministic, or alternating, the corresponding class of quantitative languages is not robust as it is not closed under the pointwise operations of max, min, sum, and numerical complement. Nondeterministic and alternating mean-payoff automata are not decidable either, as the quantitative generalization of the problems of universality and language inclusion is undecidable. We introduce a new class of quantitative languages, defined by mean-payoff automaton expressions, which is robust and decidable: it is closed under the four pointwise operations, and we show that all decision problems are decidable for this class. Mean-payoff automaton expressions subsume deterministic mean-payoff automata, and we show that they have expressive power incomparable to nondeterministic and alternating mean-payoff automata. We also present for the first time an algorithm to compute the distance between two quantitative languages, and in our case the quantitative languages are given as mean-payoff automaton expressions.
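A rough Python sketch of how a deterministic mean-payoff automaton assigns a value to a word, and how an expression combines such values pointwise. It assumes the input word is ultimately periodic (a finite prefix followed by a repeated cycle), so that the long-run average is a well-defined rational number. The dictionary encoding of the automaton and the name `mean_payoff` are hypothetical, and this is not the decision procedure of the paper.

```python
from fractions import Fraction


def mean_payoff(delta, q0, prefix, cycle):
    """Long-run average weight of the run of a deterministic automaton on the
    word prefix . cycle^omega.

    delta[(state, letter)] = (next_state, integer_weight).
    """
    if not cycle:
        raise ValueError("cycle must be non-empty")
    q = q0
    for a in prefix:
        q, _ = delta[(q, a)]          # prefix weights do not affect the limit
    seen = {}                         # state at a cycle boundary -> (round, weight so far)
    total, rounds = 0, 0
    while q not in seen:
        seen[q] = (rounds, total)
        for a in cycle:
            q, w = delta[(q, a)]
            total += w
        rounds += 1
    # The run is periodic from round r0 on; average the weights of one period.
    r0, t0 = seen[q]
    return Fraction(total - t0, (rounds - r0) * len(cycle))


# Two one-state automata: A rewards the letter 'a', B rewards the letter 'b'.
A = {("s", "a"): ("s", 1), ("s", "b"): ("s", 0)}
B = {("s", "a"): ("s", 0), ("s", "b"): ("s", 1)}


def expr_max(word):
    """Value of the expression max(A, B) on a (prefix, cycle) word."""
    return max(mean_payoff(A, "s", *word), mean_payoff(B, "s", *word))


def expr_sum(word):
    """Value of the expression sum(A, B) on a (prefix, cycle) word."""
    return mean_payoff(A, "s", *word) + mean_payoff(B, "s", *word)
```

On the word `(aab)^omega`, `A` yields 2/3 and `B` yields 1/3, so `expr_max(("", "aab"))` is 2/3 and `expr_sum(("", "aab"))` is 1.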

CiteSeerX

IST Austria: PubRep (Institute of Science and Technology)

The power of linear programming for general-valued CSPs

Author: Kolmogorov Vladimir
Thapper Johan
Zivny Stanislav
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 25/11/2014

Let $D$, called the domain, be a fixed finite set and let $\Gamma$, called the valued constraint language, be a fixed set of functions of the form $f:D^m\to\mathbb{Q}\cup\{\infty\}$, where different functions might have different arity $m$. We study the valued constraint satisfaction problem parametrised by $\Gamma$, denoted by VCSP$(\Gamma)$. These are minimisation problems given by $n$ variables and the objective function given by a sum of functions from $\Gamma$, each depending on a subset of the $n$ variables. Finite-valued constraint languages contain functions that take on only rational values and not infinite values. Our main result is a precise algebraic characterisation of valued constraint languages whose instances can be solved exactly by the basic linear programming relaxation (BLP). For a valued constraint language $\Gamma$, BLP is a decision procedure for $\Gamma$ if and only if $\Gamma$ admits a symmetric fractional polymorphism of every arity. For a finite-valued constraint language $\Gamma$, BLP is a decision procedure if and only if $\Gamma$ admits a symmetric fractional polymorphism of some arity, or equivalently, if $\Gamma$ admits a symmetric fractional polymorphism of arity 2. Using these results, we obtain tractability of several novel classes of problems, including problems over valued constraint languages that are: (1) submodular on arbitrary lattices; (2) $k$-submodular on arbitrary finite domains; (3) weakly (and hence strongly) tree-submodular on arbitrary trees.

Comment: A full version of a FOCS'12 paper by the last two authors (arXiv:1204.1079) and an ICALP'13 paper by the first author (arXiv:1207.7213) to appear in SIAM Journal on Computing (SICOMP)
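To fix ideas about what a VCSP instance is, the Python sketch below evaluates the objective as a sum of cost functions over scopes and minimises it by exhaustive search. It also tests submodularity on a totally ordered domain, a standard case in which the pair (min, max), each taken with weight 1/2, is a symmetric fractional polymorphism of arity 2. This is a brute-force illustration under those assumptions and does not implement the BLP relaxation; all names are hypothetical.

```python
from itertools import product


def is_submodular(f, domain, arity):
    """Check f(min(x, y)) + f(max(x, y)) <= f(x) + f(y) for all tuples x, y,
    with min and max taken componentwise on the ordered domain."""
    for x in product(domain, repeat=arity):
        for y in product(domain, repeat=arity):
            meet = tuple(map(min, x, y))
            join = tuple(map(max, x, y))
            if f(*meet) + f(*join) > f(*x) + f(*y):
                return False
    return True


def solve_vcsp(n, domain, constraints):
    """Minimise sum of f(assignment restricted to scope) over all assignments.

    constraints is a list of (f, scope) pairs, scope being a tuple of
    variable indices in range(n). Returns (best_cost, best_assignment).
    Exponential in n: for illustration only.
    """
    best = None
    for assignment in product(domain, repeat=n):
        cost = sum(f(*(assignment[i] for i in scope)) for f, scope in constraints)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


def cut(x, y):
    """Cost 1 when the two endpoints take different values (submodular on {0, 1})."""
    return 1 if x != y else 0


def agree(x, y):
    """Cost 1 when the two endpoints take equal values (not submodular on {0, 1})."""
    return 1 if x == y else 0
```

For example, with three Boolean variables, cut constraints on the pairs (0, 1) and (1, 2), a unary cost `3 * x` on variable 0 and `3 * (1 - x)` on variable 2, the minimum objective value is 1: variable 0 is pushed to 0, variable 2 to 1, and exactly one cut constraint is violated.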

IST Austria: PubRep (Institute of Science and Technology)

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Complexity vs energy: theory of computation and theoretical physics

Author: Manin Y.
Publication venue: 'IOP Publishing'
Publication date: 01/01/2014

MPG.PuRe

Flexible RNA design under structure and sequence constraints using formal languages

Author: Denise Alain
Ponty Yann
Vialette Stéphane
Waldispühl Jérôme
Zhang Yi
Zhou Yu
Publication date: 01/08/2013

The problem of RNA secondary structure design (also called inverse folding) is the following: given a target secondary structure, one aims to create a sequence that folds into, or is compatible with, a given structure. In several practical applications in biology, additional constraints must be taken into account, such as the presence/absence of regulatory motifs, either at a specific location or anywhere in the sequence. In this study, we investigate the design of RNA sequences from their targeted secondary structure, given these additional sequence constraints. To this purpose, we develop a general framework based on concepts of language theory, namely context-free grammars and finite automata. We efficiently combine a comprehensive set of constraints into a unifying context-free grammar of moderate size. From there, we use generic algorithms to perform a (weighted) random generation, or an exhaustive enumeration, of candidate sequences. The resulting method, whose complexity scales linearly with the length of the RNA, was implemented as a standalone program. The resulting software was embedded into a publicly available dedicated web server. The applicability of the method is demonstrated on a concrete case study dedicated to Exon Splicing Enhancers, in which our approach was successfully used in the design of in vitro experiments.

Comment: ACM BCB 2013 - ACM Conference on Bioinformatics, Computational Biology and Biomedical Informatics (2013)
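The notion of a sequence being compatible with a target structure under a forbidden-motif constraint can be sketched as follows. This Python example is a simplification and not the grammar-based method of the abstract: it assumes the structure is given in dot-bracket notation, allows the six pairs AU, UA, GC, CG, GU, UG, and handles forbidden motifs by plain rejection sampling instead of combining a context-free grammar with a finite automaton. All names are hypothetical.

```python
import random

# Base pairs assumed compatible (Watson-Crick plus GU wobble).
PAIRS = [("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")]


def pair_table(structure):
    """Map each paired position of a dot-bracket string to its partner."""
    stack, partner = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            partner[i], partner[j] = j, i
        elif ch != ".":
            raise ValueError("unexpected character: %r" % ch)
    if stack:
        raise ValueError("unbalanced structure")
    return partner


def is_compatible(seq, structure):
    """True if every paired position of the structure holds an allowed base pair."""
    partner = pair_table(structure)
    return len(seq) == len(structure) and all(
        (seq[i], seq[j]) in PAIRS for i, j in partner.items() if i < j
    )


def sample_compatible(structure, rng, forbidden=(), max_tries=1000):
    """Draw a sequence compatible with `structure` that avoids every motif in
    `forbidden`, by rejection (uniform among accepted sequences)."""
    partner = pair_table(structure)
    for _ in range(max_tries):
        seq = [None] * len(structure)
        for i, ch in enumerate(structure):
            if ch == ".":
                seq[i] = rng.choice("ACGU")
            elif ch == "(":
                seq[i], seq[partner[i]] = rng.choice(PAIRS)
        candidate = "".join(seq)
        if not any(motif in candidate for motif in forbidden):
            return candidate
    raise RuntimeError("no admissible sequence found within max_tries")
```

A call such as `sample_compatible("((..))..", random.Random(1), forbidden=("GGG",))` returns an 8-letter sequence whose positions 0/5 and 1/4 form allowed pairs and which contains no `GGG`. Rejection becomes slow when forbidden motifs are likely, which is one motivation for building the constraints into the grammar as the abstract describes.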

HAL-CentraleSupelec

INRIA a CCSD electronic archive server