Using tabu search and genetic algorithms in mathematics research
This paper discusses an ongoing project which uses computational heuristic search techniques such as tabu search and genetic algorithms as a tool for mathematics research. We discuss three ways in which such search techniques can be useful for mathematicians: in finding counterexamples to conjectures, in enumerating examples, and in finding sequences of transformations between two objects which are conjectured to be related. These problem-types are discussed using examples from topology
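As a rough illustration of the kind of heuristic search mentioned above, here is a minimal tabu search sketch over bit strings. The objective (Hamming distance to a hypothetical target), the function names, and the parameters `n_steps` and `tenure` are all assumptions made for this example; none of them come from the paper.

```python
from collections import deque


def tabu_search(score, start, n_steps=100, tenure=3):
    """Minimise `score` over tuples of 0/1 using single-bit flips.

    Indices flipped in the last `tenure` steps are tabu, unless flipping
    them would beat the best score seen so far (aspiration rule).
    """
    current = tuple(start)
    best, best_score = current, score(current)
    tabu = deque(maxlen=tenure)  # recently flipped indices
    for _ in range(n_steps):
        candidates = []
        for i in range(len(current)):
            neighbour = current[:i] + (1 - current[i],) + current[i + 1:]
            s = score(neighbour)
            if i not in tabu or s < best_score:
                candidates.append((s, i, neighbour))
        if not candidates:
            break  # every move is tabu and none improves on the best
        s, i, current = min(candidates)  # best admissible move, even if worse
        tabu.append(i)
        if s < best_score:
            best, best_score = current, s
    return best, best_score


# Hypothetical toy objective: distance to a hidden target pattern.
TARGET = (1, 0, 1, 1, 0)


def hamming_to_target(x):
    return sum(a != b for a, b in zip(x, TARGET))


result, result_score = tabu_search(hamming_to_target, (0, 0, 0, 0, 0))
```

The search always moves to the best admissible neighbour, even when that neighbour is worse than the current point, which is what lets tabu search leave local minima; the best solution seen is tracked separately.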
Single DNA conformations and biological function
From a nanoscience perspective, cellular processes and their reduced in vitro
imitations provide extraordinary examples for highly robust few or single
molecule reaction pathways. Prime examples are biochemical reactions involving
DNA molecules, and the coupling of these reactions to the physical
conformations of DNA. In this review, we summarise recent results on the
following phenomena: We investigate the biophysical properties of DNA-looping
and the equilibrium configurations of DNA-knots, whose relevance to biological
processes is increasingly appreciated. We discuss how random DNA-looping may
be related to the efficiency of the target search process of proteins for their
specific binding site on the DNA molecule. And we dwell on the spontaneous
formation of intermittent DNA nanobubbles and their importance for biological
processes, such as transcription initiation. The physical properties of DNA may
indeed turn out to be particularly suitable for the use of DNA in nanosensing
applications.
Comment: 53 pages, 45 figures. Slightly revised version of a review article
that is going to appear in the J. Comput. Theoret. Nanoscience; some typos
corrected
Folding Kinetics of Protein Like Heteropolymers
Using a simple three-dimensional lattice copolymer model and Monte Carlo
dynamics, we study the collapse and folding of protein-like heteropolymers. The
polymers are 27 monomers long and consist of two monomer types. Although these
chains are too long for exhaustive enumeration of all conformations, it is
possible to enumerate all the maximally compact conformations, which are 3x3x3
cubes. This allows us to select sequences that have a unique global minimum. We
then explore the kinetics of collapse and folding and examine what features
determine the various rates. The folding time has a plateau over a broad range
of temperatures and diverges at both high and low temperatures. The folding
time depends on sequence and is related to the amount of energetic frustration
in the native state. The collapse times of the chains are sequence independent
and are a few orders of magnitude faster than the folding times, indicating a
two-phase folding process. Below a certain temperature the chains exhibit
glass-like behavior, characterized by a slowing down of time scales and loss of
self-averaging behavior. We explicitly define the glass transition temperature
(Tg), and by comparing it to the folding temperature (Tf), we find two classes
of sequences: good folders with Tf > Tg and non-folders with Tf < Tg.
Comment: 23 pages (plus 10 figures included in a separate file) LaTeX, no
local report nu
Experimental approximation of the Jones polynomial with DQC1
We present experimental results approximating the Jones polynomial using 4
qubits in a liquid state nuclear magnetic resonance quantum information
processor. This is the first experimental implementation of a complete problem
for the deterministic quantum computation with one quantum bit model of quantum
computation, which uses a single qubit accompanied by a register of completely
random states. The Jones polynomial is a knot invariant that is important not
only to knot theory, but also to statistical mechanics and quantum field
theory. The implemented algorithm is a modification of the algorithm developed
by Shor and Jordan suitable for implementation in NMR. These experimental
results show that for the restricted case of knots whose braid representations
have four strands and exactly three crossings, identifying distinct knots is
possible 91% of the time.
Comment: 5 figures. Version 2 changes: published version, minor errors
corrected, slight changes to improve readability
Multiple Testing and Variable Selection along Least Angle Regression's path
In this article, we investigate multiple testing and variable selection using the
Least Angle Regression (LARS) algorithm in high dimensions under the Gaussian
noise assumption. LARS is known to produce a piecewise affine solution path
with change points referred to as knots of the LARS path. The cornerstone of
the present work is the expression in closed form of the exact joint law of
K-tuples of knots conditional on the variables selected by LARS, namely the
so-called post-selection joint law of the LARS knots. Numerical experiments
demonstrate the perfect fit of our finding.
Our main contributions are threefold. First, we build testing procedures on
variables entering the model along the LARS path in the general design case
when the noise level can be unknown. These testing procedures are referred to as
the Generalized t-Spacing tests (GtSt) and we prove that they have exact
non-asymptotic level (i.e., Type I error is exactly controlled). In that way,
we extend the work of Taylor et al. (2014), where the Spacing test works for
consecutive knots and known variance. Second, we introduce a new exact multiple
false negatives test after model selection in the general design case when the
noise level can be unknown. We prove that this testing procedure has exact
non-asymptotic level for general design and unknown noise level. Last, we give
an exact control of the false discovery rate (FDR) under orthogonal design
assumption. Monte-Carlo simulations and a real data experiment are provided to
illustrate our results in this case. Of independent interest, we introduce an
equivalent formulation of the LARS algorithm based on a recursive function.
Comment: 62 pages; new: FDR control and power comparison between Knockoff,
FCD, Slope and our proposed method; new: the introduction has been revised
and now gives a synthetic presentation of the main results. We believe that
this introduction brings new insights compared to the previous version
Flexible RNA design under structure and sequence constraints using formal languages
The problem of RNA secondary structure design (also called inverse folding)
is the following: given a target secondary structure, one aims to create a
sequence that folds into, or is compatible with, a given structure. In several
practical applications in biology, additional constraints must be taken into
account, such as the presence/absence of regulatory motifs, either at a
specific location or anywhere in the sequence. In this study, we investigate
the design of RNA sequences from their targeted secondary structure, given
these additional sequence constraints. To this purpose, we develop a general
framework based on concepts of language theory, namely context-free grammars
and finite automata. We efficiently combine a comprehensive set of constraints
into a unifying context-free grammar of moderate size. From there, we use
generic algorithms to perform a (weighted) random generation, or an
exhaustive enumeration, of candidate sequences. The resulting method, whose
complexity scales linearly with the length of the RNA, was implemented as a
standalone program. The resulting software was embedded into a publicly
available dedicated web server. The applicability of the method was demonstrated on
a concrete case study dedicated to Exon Splicing Enhancers, in which our
approach was successfully used in the design of in vitro experiments.
Comment: ACM BCB 2013 - ACM Conference on Bioinformatics, Computational
Biology and Biomedical Informatics (2013)
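To make the generation step concrete, here is a minimal sketch of counting and uniformly sampling sequences compatible with a target structure in dot-bracket notation. It covers only the unconstrained special case (no motif constraints, no weights, no grammar machinery), so it is not the authors' method. The function names and the choice to allow GU wobble pairs are assumptions made for this example.

```python
import random

BASES = "ACGU"
PAIRS = ["AU", "UA", "GC", "CG", "GU", "UG"]  # assumption: wobble pairs allowed


def pair_table(structure):
    """Map each '(' index to its matching ')' index in a dot-bracket string."""
    stack, partner = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            partner[stack.pop()] = i
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '('")
    return partner


def count_compatible(structure):
    """Number of sequences whose paired positions can form allowed pairs."""
    n_pairs = len(pair_table(structure))
    n_unpaired = len(structure) - 2 * n_pairs
    return len(BASES) ** n_unpaired * len(PAIRS) ** n_pairs


def sample_compatible(structure, rng):
    """Draw one compatible sequence uniformly at random in one linear pass."""
    partner = pair_table(structure)
    seq = [""] * len(structure)
    for i, ch in enumerate(structure):
        if ch == ".":
            seq[i] = rng.choice(BASES)
        elif ch == "(":
            left, right = rng.choice(PAIRS)
            seq[i], seq[partner[i]] = left, right
    return "".join(seq)


example = sample_compatible("((..))", random.Random(0))
```

Because every unpaired position and every base pair is chosen independently and uniformly, each compatible sequence is equally likely, and both functions run in time linear in the sequence length. Forbidden or mandatory motifs would break this independence, which is where the automaton and grammar construction described in the abstract becomes necessary.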
McGenus: A Monte Carlo algorithm to predict RNA secondary structures with pseudoknots
We present McGenus, an algorithm to predict RNA secondary structures with
pseudoknots. The method is based on a classification of RNA structures
according to their topological genus. McGenus can treat sequences of up to 1000
bases and performs an advanced stochastic search of their minimum free energy
structure allowing for nontrivial pseudoknot topologies. Specifically, McGenus
employs a multiple Markov chain scheme for minimizing a general scoring
function which includes not only free energy contributions for pair stacking,
loop penalties, etc., but also a phenomenological penalty for the genus of the
pairing graph. The good performance of the stochastic search strategy was
successfully validated against TT2NE which uses the same free energy
parametrization and performs exhaustive or partially exhaustive structure
search, albeit for much shorter sequences (up to 200 bases). Next, the method
was applied to other RNA sets, including an extensive tmRNA database, yielding
results that are competitive with existing algorithms. Finally, it is shown
that McGenus highlights possible limitations in the free energy scoring
function. The algorithm is available as a web-server at
http://ipht.cea.fr/rna/mcgenus.php .
Comment: 6 pages, 1 figure
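The topological genus of a pairing graph can be computed with the standard chord-diagram construction: close the backbone into a circle, count the boundary components b of the resulting ribbon graph, and use g = (P + 1 - b) / 2 for P base pairs. The sketch below follows that textbook formula; it is an assumption that it matches McGenus's convention, and the function name and input format (a list of index pairs) are invented for this example.

```python
def genus(pairs):
    """Genus of the chord diagram defined by base pairs (i, j).

    Unpaired bases do not affect the genus, so only the relative order
    of the paired positions along the backbone matters.
    """
    if not pairs:
        return 0
    ends = sorted(e for p in pairs for e in p)
    if len(set(ends)) != 2 * len(pairs):
        raise ValueError("each position may appear in at most one pair")
    rank = {pos: k for k, pos in enumerate(ends)}  # compress to 0..2P-1
    m = len(ends)
    mate = {}
    for i, j in pairs:
        mate[rank[i]] = rank[j]
        mate[rank[j]] = rank[i]
    # Walk the boundary: cross the chord, then step to the next endpoint
    # along the closed backbone. Each cycle is one boundary component.
    seen = set()
    boundaries = 0
    for start in range(m):
        if start in seen:
            continue
        boundaries += 1
        k = start
        while k not in seen:
            seen.add(k)
            k = (mate[k] + 1) % m
    return (len(pairs) + 1 - boundaries) // 2


nested = genus([(0, 5), (1, 4)])              # ordinary stem
h_type = genus([(0, 2), (1, 3)])              # two crossing pairs
kissing = genus([(0, 2), (1, 4), (3, 5)])     # kissing-hairpin pattern
```

Nested and side-by-side pairs give genus 0, while crossing pairs raise it, which is why a penalty on the genus discourages complex pseudoknot topologies.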
RNA secondary structure design
We consider the inverse-folding problem for RNA secondary structures: for a
given (pseudo-knot-free) secondary structure find a sequence that has that
structure as its ground state. If such a sequence exists, the structure is
called designable. We implemented a branch-and-bound algorithm that is able to
do an exhaustive search within the sequence space, i.e., gives an exact answer
whether such a sequence exists. The bounds required by the branch-and-bound
algorithm are calculated by a dynamic programming algorithm. We consider
different alphabet sizes and an ensemble of random structures, which we want to
design. We find that for two letters almost none of these structures are
designable. The designability improves for the three-letter case, but still a
significant fraction of structures is undesignable. This changes when we look
at the natural four-letter case with two pairs of complementary bases:
undesignable structures are the exception, although they still exist. Finally,
we also study the relation between designability and the algorithmic complexity
of the branch-and-bound algorithm. Within the ensemble of structures, a high
average degree of undesignability is correlated to a long time to prove that a
given structure is (un-)designable. In the four-letter case, where the
designability is high everywhere, the algorithmic complexity is highest in the
region of naturally occurring RNA.
Comment: 11 pages, 10 figures