A complex adaptive systems approach to the kinetic folding of RNA
The kinetic folding of RNA sequences into secondary structures is modeled as
a complex adaptive system, the components of which are possible RNA structural
rearrangements (SRs) and their associated bases and base pairs. RNA bases and
base pairs engage in local stacking interactions that determine the
probabilities (or fitnesses) of possible SRs. Meanwhile, selection operates at
the level of SRs; an autonomous stochastic process periodically (i.e., from one
time step to another) selects a subset of possible SRs for realization based on
the fitnesses of the SRs. Using examples based on selected natural and
synthetic RNAs, the model is shown to qualitatively reproduce characteristic
(nonlinear) RNA folding dynamics such as the attainment by RNAs of alternative
stable states. Possible applications of the model to the analysis of properties
of fitness landscapes, and of the RNA sequence to structure mapping are
discussed.
Comment: 23 pages, 4 figures, 2 tables, to be published in BioSystems (Note: updated 2 references
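As an illustration of the selection step described in this abstract, the following Python sketch picks a subset of candidate SRs at one time step. The rule used here (each SR is realized independently with probability equal to its share of the total fitness) and the names `select_rearrangements`, `fitnesses`, and `rng` are assumptions made for illustration; the abstract does not specify the model's actual selection rule.

```python
import random


def select_rearrangements(fitnesses, rng):
    """Return the candidate SRs realized at one time step.

    fitnesses: dict mapping an SR label to a non-negative fitness.
    rng: a random.Random instance, passed in so runs are reproducible.

    Assumed rule (hypothetical): each SR is kept independently with
    probability fitness / total_fitness.
    """
    total = sum(fitnesses.values())
    if total <= 0:
        # No SR has positive fitness, so nothing can be realized.
        return []
    return [sr for sr, f in fitnesses.items() if rng.random() < f / total]


# Toy candidates; the labels and fitness values are made up.
candidates = {"open_pair_3_20": 0.2, "close_pair_5_18": 1.5, "shift_helix": 0.8}
realized = select_rearrangements(candidates, random.Random(42))
```

Calling the function once per time step, and recomputing the fitnesses from the new structure after each call, would give a simple stochastic folding trajectory under these assumptions.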
A Comparative Analysis of Ensemble Classifiers: Case Studies in Genomics
The combination of multiple classifiers using ensemble methods is
increasingly important for making progress in a variety of difficult prediction
problems. We present a comparative analysis of several ensemble methods through
two case studies in genomics, namely the prediction of genetic interactions and
protein functions, to demonstrate their efficacy on real-world datasets and
draw useful conclusions about their behavior. These methods include simple
aggregation, meta-learning, cluster-based meta-learning, and ensemble selection
using heterogeneous classifiers trained on resampled data to improve the
diversity of their predictions. We present a detailed analysis of these methods
across 4 genomics datasets and find the best of these methods offer
statistically significant improvements over the state of the art in their
respective domains. In addition, we establish a novel connection between
ensemble selection and meta-learning, demonstrating how both of these disparate
methods establish a balance between ensemble diversity and performance.
Comment: 10 pages, 3 figures, 8 tables, to appear in Proceedings of the 2013 International Conference on Data Mining
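Of the ensemble methods listed in this abstract, simple aggregation is the easiest to sketch. The Python example below averages the predicted probabilities of several classifiers for each sample. The function name `mean_aggregate`, the 0.5 decision threshold, and the toy scores are assumptions for illustration and are not taken from the paper.

```python
def mean_aggregate(probabilities):
    """Average per-classifier probabilities for each sample.

    probabilities: list with one inner list per classifier, each holding
    that classifier's predicted probability of the positive class for
    every sample (same sample order in every inner list).
    """
    if not probabilities:
        raise ValueError("need at least one classifier")
    n_samples = len(probabilities[0])
    if any(len(p) != n_samples for p in probabilities):
        raise ValueError("all classifiers must score the same samples")
    k = len(probabilities)
    return [sum(p[i] for p in probabilities) / k for i in range(n_samples)]


# Three hypothetical classifiers scoring the same three samples.
scores = mean_aggregate([
    [0.9, 0.2, 0.6],
    [0.7, 0.4, 0.3],
    [0.8, 0.3, 0.7],
])
# Threshold the averaged scores to obtain class labels.
labels = [int(s >= 0.5) for s in scores]
```

Meta-learning (stacking) replaces the fixed average with a second-level model trained on the base classifiers' outputs.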
Stacked Penalized Logistic Regression for Selecting Views in Multi-View Learning
In biomedical research, many different types of patient data can be
collected, such as various types of omics data and medical imaging modalities.
Applying multi-view learning to these different sources of information can
increase the accuracy of medical classification models compared with
single-view procedures. However, collecting biomedical data can be expensive
and/or burdensome for patients, so it is important to reduce the amount of
required data collection. It is therefore necessary to develop multi-view
learning methods which can accurately identify those views that are most
important for prediction. In recent years, several biomedical studies have used
an approach known as multi-view stacking (MVS), where a model is trained on
each view separately and the resulting predictions are combined through
stacking. In these studies, MVS has been shown to increase classification
accuracy. However, the MVS framework can also be used for selecting a subset of
important views. To study the view selection potential of MVS, we develop a
special case called stacked penalized logistic regression (StaPLR). Compared
with existing view-selection methods, StaPLR can make use of faster
optimization algorithms and is easily parallelized. We show that nonnegativity
constraints on the parameters of the function which combines the views play an
important role in preventing unimportant views from entering the model. We
investigate the performance of StaPLR through simulations, and consider two
real data examples. We compare the performance of StaPLR with an existing view
selection method called the group lasso and observe that, in terms of view
selection, StaPLR is often more conservative and has a consistently lower false
positive rate.
Comment: 26 pages, 9 figures. Accepted manuscript
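The role of the nonnegativity constraint at the stacking level can be sketched in plain Python. This is a minimal illustration, not the StaPLR implementation: it assumes the per-view base-learner predictions are already available as a matrix `Z`, and it fits an L1-penalized logistic meta-learner by projected gradient descent. The function names, the learning rate, the penalty value, and the toy data are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_nonneg_meta_learner(Z, y, l1=0.01, lr=0.5, steps=2000):
    """Fit a logistic meta-learner whose view weights are kept >= 0.

    Z: one row per sample, one column per view; each entry is that
       view's base-learner prediction for the sample.
    y: 0/1 class labels.
    Returns (intercept, weights).
    """
    n, m = len(Z), len(Z[0])
    b, w = 0.0, [0.0] * m
    for _ in range(steps):
        grad_b = 0.0
        grad_w = [0.0] * m
        for row, label in zip(Z, y):
            err = sigmoid(b + sum(wj * zj for wj, zj in zip(w, row))) - label
            grad_b += err / n
            for j, zj in enumerate(row):
                grad_w[j] += err * zj / n
        b -= lr * grad_b
        # Gradient step including the L1 penalty, then projection onto w >= 0.
        w = [max(0.0, wj - lr * (gj + l1)) for wj, gj in zip(w, grad_w)]
    return b, w


# Toy data: view 0 tracks the label, view 1 is anti-correlated with it.
Z = [[0.9, 0.2], [0.8, 0.1], [0.1, 0.9], [0.2, 0.8]]
y = [1, 1, 0, 0]
intercept, weights = fit_nonneg_meta_learner(Z, y)
selected_views = [j for j, wj in enumerate(weights) if wj > 0]
```

In this toy setup an unconstrained fit could give the anti-correlated view a negative weight, whereas the projection step never lets a view's weight fall below zero. Views whose weight ends at exactly zero are the ones treated as not selected.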
Societal Costs of Late Blight in Potato and Prospects of Durable Resistance Through Cisgenic Modification
In the European Union almost 6 Mha of potatoes are grown representing a value of close to €6,000,000,000. Late blight caused by Phytophthora infestans causes annual losses (costs of control and damage) estimated at more than €1,000,000,000. Chemical control is under pressure as late blight becomes increasingly aggressive and there is societal resistance against the use of environmentally unfriendly chemicals. Breeding programmes have not been able to markedly increase the level of resistance of current potato varieties. New scientific approaches may yield genetically modified marker-free potato varieties (either trans- and/or cisgenic, the latter signifying the use of indigenous resistance genes) as improved variants of currently used varieties showing far greater levels of resistance. Developing such improved varieties requires strong scientific investment, but these varieties will have great economic and environmental impact. Here we present an approach, based on (cisgenic) resistance genes, that will enhance the impact. It consists of five themes: the detection of R-genes in the wild potato gene pool and their function related to the various aspects in the infection route and reproduction of the late blight causing pathogen; cloning of natural R-genes and transforming cassettes of single or multiple (cisgenic) R-genes into existing varieties with proven adaptation to improve their value for consumers; selection of true to the wild type and resistant genotypes with similar qualities as the original variety; spatial and temporal resistance management research of late blight of the cisgenic genetically modified (GM) varieties that contain different cassettes of R-genes to avoid breaking of resistance and reduce build-up of epidemics; communication and interaction with all relevant stakeholders in society and transparency in what research is doing.
One of the main challenges is to explain the different nature and possible biological improvement and legislative repercussions of cisgenic GM-crops in comparison with transgenic GM-crops. It is important to realize that the present EU Directive 2001/18/EC on GM crops does not distinguish between trans- and cisgenes. These rules were developed when only transgenic GM plants were around. We present a case arguing for an update and refinement of these rules in order to place cisgenic GM-crops in another class of GM-plants, as has been done in the past with (induced) mutation breeding and the use of protoplast fusion between crossable species.
Keywords: Cisgenesis - Cloning - Communication - Late blight - Phytophthora infestans - Potato - Resistance management - Selection - Transformation
Ab initio RNA folding
RNA molecules are essential cellular machines performing a wide variety of
functions for which a specific three-dimensional structure is required. Over
the last several years, experimental determination of RNA structures through
X-ray crystallography and NMR seems to have reached a plateau in the number of
structures resolved each year, but as more and more RNA sequences are being
discovered, the need for structure prediction tools to complement experimental data
is strong. Theoretical approaches to RNA folding have been developed since the
late nineties when the first algorithms for secondary structure prediction
appeared. Over the last 10 years a number of prediction methods for 3D
structures have been developed, first based on bioinformatics and data-mining,
and more recently based on a coarse-grained physical representation of the
systems. In this review we are going to present the challenges of RNA structure
prediction and the main ideas behind bioinformatic approaches and physics-based
approaches. We will focus on the description of the more recent physics-based
phenomenological models and on how they are built to include the specificity of
the interactions of RNA bases, whose role is critical in folding. Through
examples from different models, we will point out the strengths of
physics-based approaches, which are able not only to predict equilibrium
structures, but also to investigate dynamical and thermodynamical behavior, and
the open challenges to include more key interactions ruling RNA folding.
Comment: 28 pages, 18 figures
Physisorption of Nucleobases on Graphene
We report the results of our first-principles investigation on the
interaction of the nucleobases adenine (A), cytosine (C), guanine (G), thymine
(T), and uracil (U) with graphene, carried out within the density functional
theory framework, with additional calculations utilizing Hartree-Fock plus
second-order Møller-Plesset perturbation theory. The calculated binding energy
of the nucleobases shows the following hierarchy: G > T ~ C ~ A > U, with the
equilibrium configuration being very similar for all five of them. Our results
clearly demonstrate that the nucleobases exhibit significantly different
interaction strengths when physisorbed on graphene. The stabilizing factor in
the interaction between the base molecule and graphene sheet is dominated by
the molecular polarizability that allows a weakly attractive dispersion force
to be induced between them. The present study represents a significant step
towards a first-principles understanding of how the base sequence of DNA can
affect its interaction with carbon nanotubes, as observed experimentally.
Comment: 7 pages, 3 figures