
    A complex adaptive systems approach to the kinetic folding of RNA

    The kinetic folding of RNA sequences into secondary structures is modeled as a complex adaptive system, the components of which are possible RNA structural rearrangements (SRs) and their associated bases and base pairs. RNA bases and base pairs engage in local stacking interactions that determine the probabilities (or fitnesses) of possible SRs. Meanwhile, selection operates at the level of SRs; an autonomous stochastic process periodically (i.e., from one time step to another) selects a subset of possible SRs for realization based on the fitnesses of the SRs. Using examples based on selected natural and synthetic RNAs, the model is shown to qualitatively reproduce characteristic (nonlinear) RNA folding dynamics such as the attainment by RNAs of alternative stable states. Possible applications of the model to the analysis of properties of fitness landscapes, and of the RNA sequence-to-structure mapping are discussed. Comment: 23 pages, 4 figures, 2 tables, to be published in BioSystems (Note: updated 2 references
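    The selection step described in this abstract can be illustrated with a minimal sketch. It assumes fitness-proportional (roulette-wheel) sampling without replacement; the abstract does not state the exact sampling rule, so this is an assumption. The function name `select_rearrangements`, the SR labels, and the fitness values are all hypothetical.

```python
import random

def select_rearrangements(fitnesses, k, rng):
    """Pick up to k distinct SRs for one time step.

    Assumed rule (not taken from the paper): each draw is proportional
    to the fitness of the SRs still in the pool.
    """
    pool = dict(fitnesses)
    chosen = []
    for _ in range(min(k, len(pool))):
        names = list(pool)
        weights = [pool[name] for name in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.append(pick)
        del pool[pick]  # an SR can be realized at most once per step
    return chosen

# Hypothetical candidate SRs with made-up fitness values.
candidate_srs = {"form_helix_1": 0.6, "extend_helix_2": 0.3, "open_pair_7": 0.1}
rng = random.Random(0)
selected = select_rearrangements(candidate_srs, 2, rng)
print(selected)
```

    Repeating the call once per time step, with fitnesses recomputed from the current structure, would give a stochastic trajectory of the kind the abstract describes.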

    A Comparative Analysis of Ensemble Classifiers: Case Studies in Genomics

    The combination of multiple classifiers using ensemble methods is increasingly important for making progress in a variety of difficult prediction problems. We present a comparative analysis of several ensemble methods through two case studies in genomics, namely the prediction of genetic interactions and protein functions, to demonstrate their efficacy on real-world datasets and draw useful conclusions about their behavior. These methods include simple aggregation, meta-learning, cluster-based meta-learning, and ensemble selection using heterogeneous classifiers trained on resampled data to improve the diversity of their predictions. We present a detailed analysis of these methods across 4 genomics datasets and find that the best of these methods offer statistically significant improvements over the state of the art in their respective domains. In addition, we establish a novel connection between ensemble selection and meta-learning, demonstrating how both of these disparate methods establish a balance between ensemble diversity and performance. Comment: 10 pages, 3 figures, 8 tables, to appear in Proceedings of the 2013 International Conference on Data Minin

    Stacked Penalized Logistic Regression for Selecting Views in Multi-View Learning

    In biomedical research, many different types of patient data can be collected, such as various types of omics data and medical imaging modalities. Applying multi-view learning to these different sources of information can increase the accuracy of medical classification models compared with single-view procedures. However, collecting biomedical data can be expensive and/or burdensome for patients, so it is important to reduce the amount of required data collection. It is therefore necessary to develop multi-view learning methods which can accurately identify those views that are most important for prediction. In recent years, several biomedical studies have used an approach known as multi-view stacking (MVS), where a model is trained on each view separately and the resulting predictions are combined through stacking. In these studies, MVS has been shown to increase classification accuracy. However, the MVS framework can also be used for selecting a subset of important views. To study the view selection potential of MVS, we develop a special case called stacked penalized logistic regression (StaPLR). Compared with existing view-selection methods, StaPLR can make use of faster optimization algorithms and is easily parallelized. We show that nonnegativity constraints on the parameters of the function which combines the views play an important role in preventing unimportant views from entering the model. We investigate the performance of StaPLR through simulations, and consider two real data examples. We compare the performance of StaPLR with an existing view selection method called the group lasso and observe that, in terms of view selection, StaPLR is often more conservative and has a consistently lower false positive rate. Comment: 26 pages, 9 figures. Accepted manuscrip
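    The multi-view stacking idea in this abstract (one model per view, predictions combined by a nonnegative meta-learner) can be sketched as follows. This is a simplified illustration, not the authors' StaPLR implementation: it assumes unpenalized logistic base learners fitted by plain gradient descent, interleaved cross-validation folds, and a projected-gradient meta-learner with an L1 penalty. All function names, the toy data, and the hyperparameter values are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, nonneg=False, l1=0.0, lr=0.5, steps=300):
    """Logistic regression by (projected) gradient descent.

    With nonneg=True the weights are clipped at zero after every step and
    the L1 penalty reduces to the linear term l1 * sum(w). In this sketch
    the penalty is only applied when nonneg=True.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(steps):
        gw, gb = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            gb += err
            for j in range(p):
                gw[j] += err * xi[j]
        b -= lr * gb / n
        for j in range(p):
            w[j] -= lr * (gw[j] / n + (l1 if nonneg else 0.0))
            if nonneg:
                w[j] = max(0.0, w[j])
    return w, b

def predict(model, X):
    w, b = model
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def multi_view_stacking(views, y, folds=5, l1=0.01):
    """Fit one base learner per view, then combine their out-of-fold
    predictions with a nonnegative, L1-penalized meta-learner."""
    n = len(y)
    meta_X = [[0.0] * len(views) for _ in range(n)]
    for v, X in enumerate(views):
        for f in range(folds):
            test = [i for i in range(n) if i % folds == f]
            train = [i for i in range(n) if i % folds != f]
            model = fit_logistic([X[i] for i in train], [y[i] for i in train])
            for i, p_hat in zip(test, predict(model, [X[i] for i in test])):
                meta_X[i][v] = p_hat
    return fit_logistic(meta_X, y, nonneg=True, l1=l1)

# Toy data (hypothetical): view 0 carries signal, views 1 and 2 are noise.
rng = random.Random(0)
y = [i % 2 for i in range(60)]
signal = [[yi + rng.gauss(0, 0.5), rng.gauss(0, 1)] for yi in y]
noise_a = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in y]
noise_b = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in y]
view_weights, intercept = multi_view_stacking([signal, noise_a, noise_b], y)
print(view_weights)
```

    A view whose meta-level weight ends at exactly zero is dropped from the combined model, which is how the nonnegativity constraint and the penalty together act as a view selector in this sketch.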

    Societal Costs of Late Blight in Potato and Prospects of Durable Resistance Through Cisgenic Modification

    In the European Union almost 6 Mha of potatoes are grown representing a value of close to €6,000,000,000. Late blight caused by Phytophthora infestans causes annual losses (costs of control and damage) estimated at more than €1,000,000,000. Chemical control is under pressure as late blight becomes increasingly aggressive and there is societal resistance against the use of environmentally unfriendly chemicals. Breeding programmes have not been able to markedly increase the level of resistance of current potato varieties. New scientific approaches may yield genetically modified marker-free potato varieties (either trans- and/or cisgenic, the latter signifying the use of indigenous resistance genes) as improved variants of currently used varieties showing far greater levels of resistance. Strong scientific investments are needed to develop such improved varieties, but these varieties will have great economic and environmental impact. Here we present an approach, based on (cisgenic) resistance genes that will enhance the impact. It consists of five themes: the detection of R-genes in the wild potato gene pool and their function related to the various aspects in the infection route and reproduction of the late blight causing pathogen; cloning of natural R-genes and transforming cassettes of single or multiple (cisgenic) R-genes into existing varieties with proven adaptation to improve their value for consumers; selection of true to the wild type and resistant genotypes with similar qualities as the original variety; spatial and temporal resistance management research of late blight of the cisgenic genetically modified (GM) varieties that contain different cassettes of R-genes to avoid breaking of resistance and reduce build-up of epidemics; communication and interaction with all relevant stakeholders in society and transparency in what research is doing.
One of the main challenges is to explain the different nature and possible biological improvement and legislative repercussions of cisgenic GM-crops in comparison with transgenic GM-crops. It is important to realize that the present EU Directive 2001/18/EC on GM crops does not distinguish between trans- and cisgenes. These rules were developed when only transgenic GM plants were around. We present a case arguing for an update and refinement of these rules in order to place cisgenic GM-crops in another class of GM-plants as has been done in the past with (induced) mutation breeding and the use of protoplast fusion between crossable species. Keywords: Cisgenesis - Cloning - Communication - Late blight - Phytophthora infestans - Potato - Resistance management - Selection - Transformatio

    Ab initio RNA folding

    RNA molecules are essential cellular machines performing a wide variety of functions for which a specific three-dimensional structure is required. Over the last several years, experimental determination of RNA structures through X-ray crystallography and NMR seems to have reached a plateau in the number of structures resolved each year, but as more and more RNA sequences are being discovered, the need for structure prediction tools to complement experimental data is strong. Theoretical approaches to RNA folding have been developed since the late nineties when the first algorithms for secondary structure prediction appeared. Over the last 10 years a number of prediction methods for 3D structures have been developed, first based on bioinformatics and data-mining, and more recently based on a coarse-grained physical representation of the systems. In this review we present the challenges of RNA structure prediction and the main ideas behind bioinformatic approaches and physics-based approaches. We will focus on the description of the more recent physics-based phenomenological models and on how they are built to include the specificity of the interactions of RNA bases, whose role is critical in folding. Through examples from different models, we will point out the strengths of physics-based approaches, which are able not only to predict equilibrium structures, but also to investigate dynamical and thermodynamical behavior, and the open challenges to include more key interactions ruling RNA folding. Comment: 28 pages, 18 figure

    Physisorption of Nucleobases on Graphene

    We report the results of our first-principles investigation on the interaction of the nucleobases adenine (A), cytosine (C), guanine (G), thymine (T), and uracil (U) with graphene, carried out within the density functional theory framework, with additional calculations utilizing Hartree-Fock plus second-order Møller-Plesset perturbation theory. The calculated binding energy of the nucleobases shows the following hierarchy: G > T ~ C ~ A > U, with the equilibrium configuration being very similar for all five of them. Our results clearly demonstrate that the nucleobases exhibit significantly different interaction strengths when physisorbed on graphene. The stabilizing factor in the interaction between the base molecule and graphene sheet is dominated by the molecular polarizability that allows a weakly attractive dispersion force to be induced between them. The present study represents a significant step towards a first-principles understanding of how the base sequence of DNA can affect its interaction with carbon nanotubes, as observed experimentally. Comment: 7 pages, 3 figure