Geometric combinatorics and computational molecular biology: branching polytopes for RNA sequences
Questions in computational molecular biology generate various discrete
optimization problems, such as DNA sequence alignment and RNA secondary
structure prediction. However, the optimal solutions are fundamentally
dependent on the parameters used in the objective functions. The goal of a
parametric analysis is to elucidate such dependencies, especially as they
pertain to the accuracy and robustness of the optimal solutions. Techniques
from geometric combinatorics, including polytopes and their normal fans, have
been used previously to give parametric analyses of simple models for DNA
sequence alignment and RNA branching configurations. Here, we present a new
computational framework, and proof-of-principle results, which give the first
complete parametric analysis of the branching portion of the nearest neighbor
thermodynamic model for secondary structure prediction for real RNA sequences. Comment: 17 pages, 8 figures
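As a rough illustration of what a parametric analysis asks, the sketch below scores a few candidate structures with a linear objective and reports which one is optimal for a given parameter vector. The candidate names, their three-component signatures, and the parameter values are all invented for this example and are not taken from the paper; the paper's framework works with polytopes and normal fans rather than enumerating parameter choices.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical candidates: each structure is summarized by a signature
# (multiloops, unpaired bases in multiloops, branching helices).
# All numbers are invented for illustration only.
SIGNATURES: Dict[str, Tuple[int, int, int]] = {
    "A": (0, 10, 0),
    "B": (1, 4, 3),
    "C": (2, 1, 6),
}


def energy(params: Sequence[float], signature: Sequence[int]) -> float:
    """Linear objective: dot product of parameters and signature."""
    return sum(p * s for p, s in zip(params, signature))


def optimal_structure(params: Sequence[float]) -> str:
    """Return the name of the candidate with minimum energy."""
    return min(SIGNATURES, key=lambda name: energy(params, SIGNATURES[name]))


if __name__ == "__main__":
    # Different parameter choices can select different optimal structures.
    for params in [(1.0, 1.0, 1.0), (1.0, 0.1, 1.0), (0.5, 3.0, 0.1)]:
        print(params, optimal_structure(params))
```

Each candidate here is optimal for at least one of the three parameter choices, which is the kind of dependency a parametric analysis aims to describe for all parameters at once.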
A Seeded Genetic Algorithm for RNA Secondary Structural Prediction with Pseudoknots
This work explores a new approach to using a genetic algorithm to predict RNA secondary structures with pseudoknots. Since pseudoknots make up only a small portion of most RNA structures, the majority of structural elements from an optimal pseudoknot-free structure are likely to be part of the true structure. Thus, seeding the genetic algorithm with optimal pseudoknot-free structures is more likely to lead it to the true structure than starting from a randomly generated population. The genetic algorithm uses the known energy models with an additional augmentation to allow complex pseudoknots. The nearest-neighbor energy model is used in conjunction with Turner’s thermodynamic parameters for pseudoknot-free structures, and the H-type pseudoknot energy estimation for simple pseudoknots. Testing with known pseudoknot sequences from PseudoBase shows that it outperforms some of the currently popular algorithms.
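The seeding idea can be sketched as follows: the initial population contains the seed structures themselves plus lightly mutated copies of them, instead of random structures. This is a minimal sketch under stated assumptions, not the authors' implementation: the base-pair-set representation, the `mutate` operator, and the function names are hypothetical, and no energy model or selection step is included.

```python
import random
from typing import FrozenSet, List, Tuple

Pair = Tuple[int, int]
Structure = FrozenSet[Pair]  # a structure is a set of base pairs (i, j), i < j


def mutate(structure: Structure, length: int, rng: random.Random) -> Structure:
    """Hypothetical mutation: remove one pair, or add one between free bases.

    Added pairs may cross existing ones, which is how pseudoknotted
    candidates could arise from a pseudoknot-free seed.
    """
    pairs = set(structure)
    if pairs and rng.random() < 0.5:
        pairs.remove(rng.choice(sorted(pairs)))
    else:
        used = {i for pair in pairs for i in pair}
        free = [i for i in range(length) if i not in used]
        if len(free) >= 2:
            i, j = sorted(rng.sample(free, 2))
            pairs.add((i, j))
    return frozenset(pairs)


def seeded_population(seeds: List[Structure], size: int, length: int,
                      rng: random.Random) -> List[Structure]:
    """Start from the seeds, then fill up with mutated copies of them."""
    population = list(seeds)[:size]
    while len(population) < size:
        population.append(mutate(rng.choice(seeds), length, rng))
    return population


if __name__ == "__main__":
    rng = random.Random(0)
    # Invented pseudoknot-free seed for a sequence of length 12.
    seed = frozenset({(0, 11), (1, 10), (2, 9)})
    for structure in seeded_population([seed], size=5, length=12, rng=rng):
        print(sorted(structure))
```

In a full genetic algorithm, this population would then be scored with an energy model and evolved by selection, crossover, and further mutation.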
A new procedure to analyze RNA non-branching structures
RNA structure prediction and structural motif analysis are challenging tasks in the investigation of RNA function. We propose a novel procedure to detect structural motifs shared between two RNAs (a reference and a target). In particular, we developed two core modules: (i) nbRSSP_extractor, to assign a unique structure to the reference RNA encoded by a set of non-branching structures; (ii) SSD_finder, to detect structural motifs that the target RNA shares with the reference, by means of a new score function that rewards the relative distance of the target non-branching structures compared to the reference ones. We integrated these algorithms with already existing software to reach a coherent pipeline able to perform the following two main tasks: prediction of RNA structures (integration of RNALfold and nbRSSP_extractor) and search for chains of matches (integration of Structator and SSD_finder).
A complex adaptive systems approach to the kinetic folding of RNA
The kinetic folding of RNA sequences into secondary structures is modeled as
a complex adaptive system, the components of which are possible RNA structural
rearrangements (SRs) and their associated bases and base pairs. RNA bases and
base pairs engage in local stacking interactions that determine the
probabilities (or fitnesses) of possible SRs. Meanwhile, selection operates at
the level of SRs; an autonomous stochastic process periodically (i.e., from one
time step to another) selects a subset of possible SRs for realization based on
the fitnesses of the SRs. Using examples based on selected natural and
synthetic RNAs, the model is shown to qualitatively reproduce characteristic
(nonlinear) RNA folding dynamics such as the attainment by RNAs of alternative
stable states. Possible applications of the model to the analysis of properties
of fitness landscapes, and of the RNA sequence-to-structure mapping are
discussed. Comment: 23 pages, 4 figures, 2 tables, to be published in BioSystems (Note:
updated 2 references)
Parametric inference of recombination in HIV genomes
Recombination is an important event in the evolution of HIV. It affects the
global spread of the pandemic as well as evolutionary escape from host immune
response and from drug therapy within single patients. Comprehensive
computational methods are needed for detecting recombinant sequences in large
databases, and for inferring the parental sequences.
We present a hidden Markov model to annotate a query sequence as a
recombinant of a given set of aligned sequences. Parametric inference is used
to determine all optimal annotations for all parameters of the model. We show
that the inferred annotations recover most features of established hand-curated
annotations. Thus, parametric analysis of the hidden Markov model is feasible
for HIV full-length genomes, and it improves the detection and annotation of
recombinant forms.
All computational results, reference alignments, and C++ source code are
available at http://bio.math.berkeley.edu/recombination/. Comment: 20 pages, 5 figures
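To make the annotation task concrete, the sketch below runs the Viterbi algorithm on a simplified hidden Markov model whose hidden states are the aligned parental sequences: the model either stays with the current parent or switches to another (a recombination breakpoint), and emits a match or a mismatch against the query. The function name `annotate`, the two parameters `p_switch` and `p_mismatch`, and the toy sequences are assumptions made for this illustration. It computes one optimal annotation for one parameter choice, whereas the parametric inference described above characterizes the optimal annotations for all parameters.

```python
import math
from typing import List


def annotate(query: str, parents: List[str],
             p_switch: float = 0.01, p_mismatch: float = 0.05) -> List[int]:
    """Return, for each query position, the index of the most likely parent.

    Assumes the query and all parents are aligned and of equal length.
    """
    n, k = len(query), len(parents)
    log_stay = math.log(1 - p_switch)
    log_switch = math.log(p_switch / (k - 1)) if k > 1 else float("-inf")

    def emit(state: int, pos: int) -> float:
        match = parents[state][pos] == query[pos]
        return math.log(1 - p_mismatch) if match else math.log(p_mismatch)

    def step(prev: int, cur: int) -> float:
        return log_stay if prev == cur else log_switch

    # Uniform prior over parents at the first position.
    score = [emit(s, 0) - math.log(k) for s in range(k)]
    back: List[List[int]] = []
    for pos in range(1, n):
        new_score, pointers = [], []
        for s in range(k):
            best = max(range(k), key=lambda r: score[r] + step(r, s))
            new_score.append(score[best] + step(best, s) + emit(s, pos))
            pointers.append(best)
        score = new_score
        back.append(pointers)

    # Trace back the best path from the best final state.
    state = max(range(k), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Toy alignment: the query copies parent 0, then parent 1.
    print(annotate("AAAAACCCCC", ["AAAAAAAAAA", "CCCCCCCCCC"]))
```

Lowering `p_switch` makes breakpoints more expensive, so the optimal annotation can change with the parameters; that sensitivity is what a parametric analysis makes explicit.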
TinkerCell: Modular CAD Tool for Synthetic Biology
Synthetic biology brings together concepts and techniques from engineering
and biology. In this field, computer-aided design (CAD) is necessary in order
to bridge the gap between computational modeling and biological data. An
application named TinkerCell has been created in order to serve as a CAD tool
for synthetic biology. TinkerCell is a visual modeling tool that supports a
hierarchy of biological parts. Each part in this hierarchy consists of a set of
attributes that define the part, such as sequence or rate constants. Models
that are constructed using these parts can be analyzed using various C and
Python programs that are hosted by TinkerCell via an extensive C and Python
API. TinkerCell supports the notion of modules, which are networks with
interfaces. Such modules can be connected to each other, forming larger modular
networks. Because TinkerCell associates parameters and equations in a model
with their respective part, parts can be loaded from databases along with their
parameters and rate equations. The modular network design can be used to
exchange modules as well as test the concept of modularity in biological
systems. The flexible modeling framework along with the C and Python API allows
TinkerCell to serve as a host to numerous third-party algorithms. TinkerCell is
a free and open-source project under the Berkeley Software Distribution
license. Downloads, documentation, and tutorials are available at
www.tinkercell.com. Comment: 23 pages, 20 figures
The Evolutionary Unfolding of Complexity
We analyze the population dynamics of a broad class of fitness functions that
exhibit epochal evolution---a dynamical behavior, commonly observed in both
natural and artificial evolutionary processes, in which long periods of stasis
in an evolving population are punctuated by sudden bursts of change. Our
approach---statistical dynamics---combines methods from both statistical
mechanics and dynamical systems theory in a way that offers an alternative to
current ``landscape'' models of evolutionary optimization. We describe the
population dynamics on the macroscopic level of fitness classes or phenotype
subbasins, while averaging out the genotypic variation that is consistent with
a macroscopic state. Metastability in epochal evolution occurs solely at the
macroscopic level of the fitness distribution. While a balance between
selection and mutation maintains a quasistationary distribution of fitness,
individuals diffuse randomly through selectively neutral subbasins in genotype
space. Sudden innovations occur when, through this diffusion, a genotypic
portal is discovered that connects to a new subbasin of higher fitness
genotypes. In this way, we identify innovations with the unfolding and
stabilization of a new dimension in the macroscopic state space. The
architectural view of subbasins and portals in genotype space clarifies how
frozen accidents and the resulting phenotypic constraints guide the evolution
to higher complexity. Comment: 28 pages, 5 figures