BOSS: Bayesian Optimization over String Spaces

Authors: Beck Daniel; Gonzalez Javier; Leslie David S.; Moss Henry B.; Rayson Paul
Publication date: 02/10/2020

This article develops a Bayesian optimization (BO) method that acts directly over raw strings, proposing the first uses of string kernels and genetic algorithms within BO loops. Recent applications of BO over strings have been hindered by the need to map inputs into a smooth and unconstrained latent space. Learning this projection is both computationally expensive and data-intensive. Our approach instead builds a powerful Gaussian process surrogate model based on string kernels, naturally supporting variable-length inputs, and performs efficient acquisition function maximization for spaces with syntactical constraints. Experiments demonstrate considerably improved optimization over existing approaches across a broad range of constraints, including the popular setting where syntax is governed by a context-free grammar.

arXiv.org e-Print Archive

Lancaster E-Prints
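
The BOSS abstract above relies on string kernels to compare raw strings of different lengths inside a Gaussian process. As a rough illustration only, the sketch below uses a k-spectrum kernel (counts of shared contiguous substrings of length k). This is a deliberately simpler, swapped-in kernel and is not claimed to be the one the authors use; the function names and the default k=2 are assumptions made for the example.

```python
from collections import Counter
import math


def spectrum_features(s, k=2):
    """Count every contiguous substring of length k in s."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


def spectrum_kernel(a, b, k=2):
    """Inner product of the k-mer count vectors of a and b."""
    fa, fb = spectrum_features(a, k), spectrum_features(b, k)
    return sum(count * fb[kmer] for kmer, count in fa.items())


def normalized_kernel(a, b, k=2):
    """Cosine-normalized kernel, so strings of different lengths are comparable."""
    denom = math.sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0
```

A kernel matrix built from `normalized_kernel` over a set of candidate strings could then serve as the covariance of a Gaussian process surrogate, with no latent-space embedding required.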

Decompositions of Grammar Constraints

Authors: Quimper Claude-Guy; Walsh Toby
Publication date: 01/01/2008

A wide range of constraints can be compactly specified using automata or formal languages. In a sequence of recent papers, we have shown that an effective means to reason with such specifications is to decompose them into primitive constraints. We can then, for instance, use state-of-the-art SAT solvers and profit from their advanced features like fast unit propagation, clause learning, and conflict-based search heuristics. This approach holds promise for solving combinatorial problems in scheduling, rostering, and configuration, as well as problems in more diverse areas like bioinformatics, software testing and natural language processing. In addition, decomposition may be an effective method to propagate other global constraints.

Comment: Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence

arXiv.org e-Print Archive

CiteSeerX

PolyPublie
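
To make the idea of decomposition in the abstract above concrete, here is a minimal sketch for the simplest case, a constraint given by a deterministic finite automaton. The automaton is unrolled into one small transition constraint per position, and values with no support in any accepting run are pruned by a forward and a backward pass. This illustrates the general layered decomposition only; it is not the SAT encoding from the paper, and the function name and data layout are assumptions made for the example.

```python
def prune_regular(domains, transitions, start, accepting):
    """Prune per-position symbol domains against a DFA.

    domains:     list of sets of symbols, one set per position
    transitions: dict mapping (state, symbol) -> next state
    Returns a new list of domains keeping only supported symbols.
    """
    n = len(domains)

    # forward[i]: states reachable after reading i symbols
    forward = [set() for _ in range(n + 1)]
    forward[0] = {start}
    for i in range(n):
        for q in forward[i]:
            for a in domains[i]:
                nxt = transitions.get((q, a))
                if nxt is not None:
                    forward[i + 1].add(nxt)

    # backward[i]: reachable states at position i that can still reach acceptance
    backward = [set() for _ in range(n + 1)]
    backward[n] = forward[n] & set(accepting)
    for i in range(n - 1, -1, -1):
        for q in forward[i]:
            for a in domains[i]:
                if transitions.get((q, a)) in backward[i + 1]:
                    backward[i].add(q)

    # A symbol survives if some live state has a transition on it to a live state.
    return [
        {a for a in domains[i]
         if any(transitions.get((q, a)) in backward[i + 1] for q in backward[i])}
        for i in range(n)
    ]
```

For instance, with a two-state automaton for the language a*b* and the middle position fixed to `a`, the symbol `b` is removed from the first position because no accepting run can use it there.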

Flexible RNA design under structure and sequence constraints using formal languages

Authors: Denise Alain; Ponty Yann; Vialette Stéphane; Waldispühl Jérôme; Zhang Yi; Zhou Yu
Publication date: 01/08/2013

The problem of RNA secondary structure design (also called inverse folding) is the following: given a target secondary structure, one aims to create a sequence that folds into, or is compatible with, that structure. In several practical applications in biology, additional constraints must be taken into account, such as the presence or absence of regulatory motifs, either at a specific location or anywhere in the sequence. In this study, we investigate the design of RNA sequences from their targeted secondary structure, given these additional sequence constraints. To this end, we develop a general framework based on concepts of language theory, namely context-free grammars and finite automata. We efficiently combine a comprehensive set of constraints into a unifying context-free grammar of moderate size. From there, we use generic algorithms to perform a (weighted) random generation, or an exhaustive enumeration, of candidate sequences. The resulting method, whose complexity scales linearly with the length of the RNA, was implemented as a standalone program. The resulting software was embedded into a publicly available dedicated web server. The applicability of the method is demonstrated on a concrete case study dedicated to Exon Splicing Enhancers, in which our approach was successfully used in the design of in vitro experiments.

Comment: ACM BCB 2013 - ACM Conference on Bioinformatics, Computational Biology and Biomedical Informatics (2013)

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL-Polytechnique

HAL - UPEC / UPEM
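
The abstract above mentions generic algorithms for random generation from a context-free grammar. One standard way to do this is to count the words each nonterminal derives at each length and then pick productions in proportion to those counts. The sketch below applies that idea to a toy unambiguous grammar for dot-bracket strings, S -> '' | '.' S | '(' S ')' S. The grammar, the function names, and the uniform weighting are assumptions chosen for illustration; they are not the grammar or the weighting scheme built in the paper.

```python
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def count(n):
    """Number of strings of length n derived from S -> '' | '.' S | '(' S ')' S."""
    if n == 0:
        return 1
    total = count(n - 1)              # production S -> '.' S
    for k in range(n - 1):            # production S -> '(' S ')' S, inner length k
        total += count(k) * count(n - 2 - k)
    return total


def sample(n, rng=random):
    """Draw one string of length n uniformly at random from the grammar."""
    if n == 0:
        return ""
    r = rng.randrange(count(n))
    if r < count(n - 1):
        return "." + sample(n - 1, rng)
    r -= count(n - 1)
    for k in range(n - 1):
        block = count(k) * count(n - 2 - k)
        if r < block:
            return "(" + sample(k, rng) + ")" + sample(n - 2 - k, rng)
        r -= block
    raise AssertionError("counts are inconsistent")
```

Because the grammar is unambiguous, each string of length n corresponds to exactly one derivation, so choosing productions proportionally to the counts yields a uniform draw. Replacing the counts with weighted sums would give a weighted generator in the same style.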

Taking Primitive Optimality Theory Beyond the Finite State

Author: Albro Daniel
Publication date: 01/01/2000

Primitive Optimality Theory (OTP) (Eisner, 1997a; Albro, 1998), a computational model of Optimality Theory (Prince and Smolensky, 1993), employs a finite state machine to represent the set of active candidates at each stage of an Optimality Theoretic derivation, as well as weighted finite state machines to represent the constraints themselves. For some purposes, however, it would be convenient if the set of candidates were limited by some set of criteria capable of being described only in a higher-level grammar formalism, such as a Context Free Grammar, a Context Sensitive Grammar, or a Multiple Context Free Grammar (Seki et al., 1991). Examples include reduplication and phrasal stress models. Here we introduce a mechanism for OTP-like Optimality Theory in which the constraints remain weighted finite state machines, but sets of candidates are represented by higher-level grammars. In particular, we use multiple context-free grammars to model reduplication in the manner of Correspondence Theory (McCarthy and Prince, 1995), and develop an extended version of the Earley Algorithm (Earley, 1970) to apply the constraints to a reduplicating candidate set.

Comment: 11 pages, 5 figures, workshop

arXiv.org e-Print Archive

CiteSeerX