CLP-based protein fragment assembly
The paper investigates a novel approach, based on Constraint Logic
Programming (CLP), to predict the 3D conformation of a protein via fragments
assembly. The fragments are extracted from a database of known protein
structures by a preprocessor, also developed for this work, that clusters and
classifies the fragments according to similarity and frequency. The problem of assembling
fragments into a complete conformation is mapped to a constraint solving
problem and solved using CLP. The constraint-based model uses a medium
discretization degree Cα-side chain centroid protein model that offers
efficiency and a good approximation for space filling. The approach adapts
existing energy models to the protein representation used and applies a large
neighborhood search strategy. The results show the feasibility and efficiency
of the method. The declarative nature of the solution allows future
extensions to be included, e.g., different size fragments for better accuracy.
Comment: special issue dedicated to ICLP 201
Algorithm engineering for optimal alignment of protein structure distance matrices
Protein structural alignment is an important problem in computational
biology. In this paper, we present first successes on provably optimal pairwise
alignment of protein inter-residue distance matrices, using the popular Dali
scoring function. We introduce the structural alignment problem formally, which
enables us to express a variety of scoring functions used in previous work as
special cases in a unified framework. Further, we propose the first
mathematical model for computing optimal structural alignments based on dense
inter-residue distance matrices. We therefore reformulate the problem as a
special graph problem and give a tight integer linear programming model. We
then present algorithm engineering techniques to handle the huge integer linear
programs of real-life distance matrix alignment problems. Applying these
techniques, we can compute provably optimal Dali alignments for the very first
time.
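As a rough illustration of what is being optimized in the abstract above, the sketch below builds inter-residue distance matrices and evaluates a Dali-style elastic similarity score for a fixed alignment. It assumes the commonly cited form of the score (threshold 0.2 and a Gaussian envelope of 20 Å); the function names `distance_matrix` and `dali_like_score` are hypothetical, and the code only scores a given alignment. It does not implement the paper's integer linear programming model or search for an optimal alignment.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between residue coordinates (e.g. C-alpha atoms)."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def dali_like_score(dist_a, dist_b, alignment, theta=0.2, alpha=20.0):
    """Sum elastic-similarity terms over all ordered pairs of aligned residue pairs.

    alignment: list of (i, k) pairs, residue i of structure A matched to
    residue k of structure B (assumed one-to-one).
    theta, alpha: assumed constants of the commonly cited Dali elastic score.
    """
    score = 0.0
    for (i, k) in alignment:
        for (j, l) in alignment:
            if (i, k) == (j, l):
                # Diagonal term: an aligned pair compared with itself.
                score += theta
                continue
            d_a, d_b = dist_a[i][j], dist_b[k][l]
            mean = (d_a + d_b) / 2.0
            if mean == 0.0:
                # Degenerate case (coincident atoms): treat as perfect agreement.
                score += theta
                continue
            # Relative deviation of the two distances, down-weighted for far pairs.
            score += (theta - abs(d_a - d_b) / mean) * math.exp(-((mean / alpha) ** 2))
    return score


# Toy example: a straight three-residue chain versus a bent copy of it.
straight = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
identity = [(0, 0), (1, 1), (2, 2)]
d_straight = distance_matrix(straight)
d_bent = distance_matrix(bent)
score_same = dali_like_score(d_straight, d_straight, identity)
score_bent = dali_like_score(d_straight, d_bent, identity)
```

In this toy case the self-comparison scores higher than the comparison against the bent chain, because the distance between residues 0 and 2 no longer agrees. Finding the alignment that maximizes such a score is the hard combinatorial part that the abstract addresses.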
RNAiFold2T: Constraint Programming design of thermo-IRES switches
Motivation: RNA thermometers (RNATs) are cis-regulatory elements that
change secondary structure upon temperature shift. Often involved in the
regulation of heat shock, cold shock and virulence genes, RNATs constitute an
interesting potential resource in synthetic biology, where engineered RNATs
could prove to be useful tools in biosensors and conditional gene regulation.
Results: Solving the 2-temperature inverse folding problem is critical for RNAT
engineering. Here we introduce RNAiFold2T, the first Constraint Programming
(CP) and Large Neighborhood Search (LNS) algorithms to solve this problem.
Benchmarking tests of RNAiFold2T against existing inverse folding programs
(adaptive walk and genetic algorithm) show that our software generates two orders
of magnitude more solutions, thus allowing ample exploration of the space of
solutions. Subsequently, solutions can be prioritized by computing various
measures, including probability of target structure in the ensemble, melting
temperature, etc. Using this strategy, we rationally designed two thermosensor
internal ribosome entry site (thermo-IRES) elements, whose normalized
cap-independent translation efficiency is approximately 50% greater at 42°C
than 30°C, when tested in reticulocyte lysates. Translation efficiency is lower
than that of the wild-type IRES element, which on the other hand is fully
resistant to temperature shift-up. This appears to be the first purely
computational design of functional RNA thermoswitches, and certainly the first
purely computational design of functional thermo-IRES elements. Availability:
RNAiFold2T is publicly available as part of the new release RNAiFold3.0 at
https://github.com/clotelab/RNAiFold and
http://bioinformatics.bc.edu/clotelab/RNAiFold, the latter of which also
provides a web server. The software is written in C++ and uses the OR-Tools CP
search engine.
Comment: 24 pages, 5 figures, Intelligent Systems for Molecular Biology (ISMB
2016), to appear in journal Bioinformatics 201
Complete RNA inverse folding: computational design of functional hammerhead ribozymes
Nanotechnology and synthetic biology currently constitute one of the most
innovative, interdisciplinary fields of research, poised to radically transform
society in the 21st century. This paper concerns the synthetic design of
ribonucleic acid molecules, using our recent algorithm, RNAiFold, which can
determine all RNA sequences whose minimum free energy secondary structure is a
user-specified target structure. Using RNAiFold, we design ten cis-cleaving
hammerhead ribozymes, all of which are shown to be functional by a cleavage
assay. We additionally use RNAiFold to design a functional cis-cleaving
hammerhead as a modular unit of a synthetic larger RNA. Analysis of kinetics on
this small set of hammerheads suggests that the cleavage rate of computationally
designed ribozymes may be correlated with positional entropy, ensemble defect,
structural flexibility/rigidity and related measures. Artificial ribozymes have
been designed in the past either manually or by SELEX (Systematic Evolution of
Ligands by Exponential Enrichment); however, this appears to be the first
purely computational design and experimental validation of novel functional
ribozymes. RNAiFold is available at
http://bioinformatics.bc.edu/clotelab/RNAiFold/.
Comment: 17 pages, 2 tables, 7 figures, final version to appear in Nucleic
Acids Researc
A computer system to perform structure comparison using TOPS representations of protein structure
We describe the design and implementation of a fast topology-based method
for protein structure comparison. The approach uses the TOPS topological representation
of protein structure, aligning two structures using a common discovered
pattern and generating a measure of distance derived from an insert score. Heavy
use is made of a constraint-based pattern matching algorithm for TOPS diagrams
that we have designed and described elsewhere (Gilbert et al., 1999). The comparison
system is maintained at the European Bioinformatics Institute and is available
over the Web at tops.ebi.ac.uk/tops. Users submit a structure description in
Protein Data Bank (PDB) format and can compare it with structures in the entire
PDB or a representative subset of protein domains, receiving the results by email.
A Graph Grammar for Modelling RNA Folding
We propose a new approach for modelling the process of RNA folding as a graph
transformation guided by the global value of free energy. Since the folding
process evolves towards a configuration in which the free energy is minimal,
the global behaviour resembles that of a self-adaptive system. Each RNA
configuration is a graph and the evolution of configurations is constrained by
precise rules that can be described by a graph grammar.
Comment: In Proceedings GaM 2016, arXiv:1612.0105
Flexible RNA design under structure and sequence constraints using formal languages
The problem of RNA secondary structure design (also called inverse folding)
is the following: given a target secondary structure, one aims to create a
sequence that folds into, or is compatible with, that structure. In several
practical applications in biology, additional constraints must be taken into
account, such as the presence/absence of regulatory motifs, either at a
specific location or anywhere in the sequence. In this study, we investigate
the design of RNA sequences from their targeted secondary structure, given
these additional sequence constraints. To this purpose, we develop a general
framework based on concepts of language theory, namely context-free grammars
and finite automata. We efficiently combine a comprehensive set of constraints
into a unifying context-free grammar of moderate size. From there, we use
generic algorithms to perform a (weighted) random generation, or an
exhaustive enumeration, of candidate sequences. The resulting method, whose
complexity scales linearly with the length of the RNA, was implemented as a
standalone program. The resulting software was embedded into a publicly
available dedicated web server. The applicability of the method was demonstrated on
a concrete case study dedicated to Exon Splicing Enhancers, in which our
approach was successfully used in the design of in vitro experiments.
Comment: ACM BCB 2013 - ACM Conference on Bioinformatics, Computational
Biology and Biomedical Informatics (2013
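To make the notion of a sequence being "compatible with" a target structure in the abstract above concrete, here is a minimal sketch. It assumes dot-bracket notation for the structure and the usual canonical pairs (A-U, G-C, G-U); the names `pair_table`, `is_compatible` and the `forbidden_motifs` parameter are hypothetical. It is a direct check on one candidate sequence, not the context-free grammar and automaton construction the authors describe.

```python
# Base pairs assumed admissible: Watson-Crick pairs plus the G-U wobble pair.
CANONICAL_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def pair_table(structure):
    """Map each paired position to its partner for a dot-bracket string."""
    stack, pairs = [], {}
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            opener = stack.pop()
            pairs[opener] = pos
            pairs[pos] = opener
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def is_compatible(sequence, structure, forbidden_motifs=()):
    """True if every paired position can form a canonical pair and
    no forbidden motif occurs anywhere in the sequence."""
    if len(sequence) != len(structure):
        return False
    if any(motif in sequence for motif in forbidden_motifs):
        return False
    pairs = pair_table(structure)
    return all(
        (sequence[i], sequence[j]) in CANONICAL_PAIRS
        for i, j in pairs.items()
        if i < j  # visit each base pair once
    )


target = "(((...)))"
ok = is_compatible("GGGAAAUCC", target)
bad_pair = is_compatible("GGGAAACCA", target)
bad_motif = is_compatible("GGGAAACCC", target, forbidden_motifs=("AAA",))
```

Compatibility is only a necessary condition: a compatible sequence may still fold into a different minimum free energy structure, which is why design methods combine such constraints with sampling or enumeration and a folding step.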