54 research outputs found

    Prioritizing bona fide bacterial small RNAs with machine learning classifiers

    Bacterial small RNAs (sRNAs) are involved in the control of several cellular processes. Hundreds of putative sRNAs have been identified in many bacterial species through RNA sequencing. The existence of putative sRNAs is usually validated by Northern blot analysis. However, the large number of novel putative sRNAs reported in the literature makes it impractical to validate each of them in the wet lab. In this work, we applied five machine learning approaches to construct twenty models to discriminate bona fide sRNAs from random genomic sequences in five bacterial species. Sequences were represented using seven features including free energy of their predicted secondary structure, their distances to the closest predicted promoter site and Rho-independent terminator, and their distance to the closest open reading frames (ORFs). To automatically calculate these features, we developed an sRNA Characterization Pipeline (sRNACharP). All seven features used in the classification task contributed positively to the performance of the predictive models. The best performing model obtained a median precision of 100% at 10% recall and of 64% at 40% recall across all five bacterial species, and it outperformed previously published approaches on two benchmark datasets in terms of precision and recall. Our results indicate that even though there is limited sRNA sequence conservation across different bacterial species, there are intrinsic features in the genomic context of sRNAs that are conserved across taxa. We show that these features are utilized by machine learning approaches to learn a species-independent model to prioritize bona fide bacterial sRNAs

    Small RNA targets : advances in prediction tools and high-throughput profiling

    MicroRNAs (miRNAs) are an abundant class of small non-coding RNAs that regulate gene expression at the post-transcriptional level. They are suggested to be involved in most biological processes of the cell primarily by targeting messenger RNAs (mRNAs) for cleavage or translational repression. Their binding to their target sites is mediated by the Argonaute (AGO) family of proteins. Thus, miRNA target prediction is pivotal for research and clinical applications. Moreover, transfer-RNA-derived fragments (tRFs) and other types of small RNAs have been found to be potent regulators of AGO-mediated gene expression. Their role in mRNA regulation is still to be fully elucidated, and advancements in the computational prediction of their targets are in their infancy. To shed light on these complex RNA–RNA interactions, the availability of good quality high-throughput data and reliable computational methods is of utmost importance. Even though the arsenal of computational approaches in the field has been enriched in the last decade, there is still a degree of discrepancy between the results they yield. This review offers an overview of the relevant advancements in the field of bioinformatics and machine learning and summarizes the key strategies utilized for small RNA target prediction. Furthermore, we report the recent development of high-throughput sequencing technologies, and explore the role of non-miRNA AGO driver sequences.

    Modeling metabolism of Mycobacterium tuberculosis

    Approximately one-fourth of the Mycobacterium tuberculosis (Mtb) genome contains genes that encode enzymes directly involved in its metabolism. These enzymes represent potential drug targets that can be systematically probed with constraint based (CB) models through the prediction of genes essential (or the combination thereof) for the pathogen to grow. However, gene essentiality depends on the growth conditions and, so far, no in vitro model precisely mimics the host at the different stages of mycobacterial infection, limiting model predictions. A first step in creating such a model is a thoroughly curated and extended genome-scale CB metabolic model of Mtb metabolism. The history of genome-scale CB models of Mtb metabolism up to model sMtb is discussed and sMtb is quantitatively validated using 13C measurements. The human pathogen Mtb has the capacity to escape eradication by professional phagocytes. During infection, Mtb resists the harsh environment of phagosomes and actively manipulates macrophages and dendritic cells to ensure prolonged intracellular survival. In contrast to many other intracellular pathogens, it has remained difficult to capture the transcriptome of mycobacteria during infection due to an unfavorable host-to-pathogen ratio. The human macrophage-like cell line THP-1 was infected with the attenuated Mtb surrogate Mycobacterium bovis Bacillus Calmette–Guérin (M. bovis BCG). Mycobacterial RNA was up to 1000-fold underrepresented in total RNA preparations of infected host cells. By combining microbial enrichment with specific ribosomal RNA depletion the transcriptional responses of host and pathogen during infection were simultaneously analyzed using dual RNA sequencing. Mycobacterial pathways for cholesterol degradation and iron acquisition are upregulated during infection. In addition, genes involved in the methylcitrate cycle, aspartate metabolism and recycling of mycolic acids are induced. 
In response to M. bovis BCG infection, host cells upregulate de novo cholesterol biosynthesis presumably to compensate for the loss of this metabolite by bacterial catabolism. By systematically probing the metabolic network underpinning sMtb, the reactions that are essential for Mtb are identified. A majority of these reactions are catalyzed by enzymes and thus represent candidate drug targets to fight an Mtb infection. Modeling the behavior of the bacteria during infection requires knowledge of the so-called biomass reaction that represents bacterial biomass composition. This composition varies in different environments or bacterial growth phases. Accurate modeling of all fluxes through metabolism under a given condition at a moment in time, the so-called metabolic state, requires a precise description of the biomass reaction for the described condition. The transcript abundance data obtained by dual RNA sequencing was used to develop a straightforward and systematic method to obtain a condition-specific biomass reaction for Mtb during in vitro growth and during infection of its host. The method described herein is virtually free of any pre-set assumptions on uptake rates of nutrients, making it suitable for exploring environments with limited accessibility. The condition-specific biomass reaction represents the 'metabolic objective' of Mtb in a given environment (in-host growth and growth on defined medium) at a specific time point, and as such allows modeling the bacterial metabolic state in these environments. Five different biomass reactions were used to predict nutrient uptake rates and gene essentiality. Predictions were subsequently compared to available experimental data. Nutrient uptake can accurately be predicted, but accurate gene essentiality predictions remain difficult to obtain. By combining sMtb and a model of human metabolism, model sMtb-RECON was developed and used to predict the metabolic state of Mtb during infection of the host. 
Amino acids are predicted to be used for energy production as well as biomass formation. Subsequently the effect of increasing dosages of drugs, targeting metabolism, on the metabolic state of the pathogen was assessed, and the resulting metabolic adaptations and flux rerouting through various pathways were predicted. In particular, the TCA cycle becomes more important upon drug application, as well as alanine, aspartate, glutamate, proline, arginine and porphyrin metabolism, while glycine, serine and threonine metabolism become less important for survival. Notably, an effect of eight out of eleven metabolically active drugs could be recreated and two major profiles of the metabolic state were predicted. The profiles of the metabolic states of Mtb affected by the drugs BTZ043, cycloserine and its derivative terizidone, ethambutol, ethionamide, propionamide, and isoniazid were very similar, while TMC207 is predicted to have quite a different effect on metabolism as it inhibits ATP synthase and therefore indirectly interferes with a multitude of metabolic pathways.

    Plant biosystems design research roadmap 1.0

    Human life intimately depends on plants for food, biomaterials, health, energy, and a sustainable environment. Various plants have been genetically improved mostly through breeding, along with limited modification via genetic engineering, yet they are still not able to meet the ever-increasing needs, in terms of both quantity and quality, resulting from the rapid increase in world population and expected standards of living. A step change that may address these challenges would be to expand the potential of plants using biosystems design approaches. This represents a shift in plant science research from relatively simple trial-and-error approaches to innovative strategies based on predictive models of biological systems. Plant biosystems design seeks to accelerate plant genetic improvement using genome editing and genetic circuit engineering or create novel plant systems through de novo synthesis of plant genomes. From this perspective, we present a comprehensive roadmap of plant biosystems design covering theories, principles, and technical methods, along with potential applications in basic and applied plant biology research. We highlight current challenges, future opportunities, and research priorities, along with a framework for international collaboration, towards rapid advancement of this emerging interdisciplinary area of research. Finally, we discuss the importance of social responsibility in utilizing plant biosystems design and suggest strategies for improving public perception, trust, and acceptance

    Evolvability and rate of evolution in evolutionary computation

    Evolvability has emerged as a research topic in both natural and computational evolution. It is a notion put forward to investigate the fundamental mechanisms that enable a system to evolve. A number of hypotheses have been proposed in modern biological research based on the examination of various mechanisms in the biosphere for their contribution to evolvability. Therefore, it is intriguing to try to transfer new discoveries from biology to Evolutionary Computation (EC) systems and test them there, so that computational models would be improved and a better understanding of general evolutionary mechanisms is achieved. Rate of evolution comes in different flavors in natural and computational evolution. Specifically, we distinguish the rate of fitness progression from that of genetic substitutions. The former is a common concept in EC since the ability to explicitly quantify the fitness of an evolutionary individual is one of the most important differences between computational systems and natural systems. Within the biological research community, the definition of rate of evolution varies, depending on the objects being examined such as gene sequences, proteins, tissues, etc. For instance, molecular biologists tend to use the rate of genetic substitutions to quantify how fast evolution proceeds at the genetic level. This concept of rate of evolution focuses on the evolutionary dynamics underlying fitness development, due to the inability to mathematically define fitness in a natural system. In EC, the rate of genetic substitutions suggests an unconventional and potentially powerful method to measure the rate of evolution by accessing lower levels of evolutionary dynamics. Central to this thesis is our new definition of rate of evolution in EC. We transfer the method of measurement of the rate of genetic substitutions from molecular biology to EC. 
The implementation in a Genetic Programming (GP) system shows that such measurements can indeed be performed and reflect well how evolution proceeds. Below the level of fitness development it provides observables at the genetic level of a GP population during evolution. We apply this measurement method to investigate the effects of four major configuration parameters in EC, i.e., mutation rate, crossover rate, tournament selection size, and population size, and show that some insights can be gained into the effectiveness of these parameters with respect to evolution acceleration. Further, we observe that population size plays an important role in determining the rate of evolution. We formulate a new indicator based on this rate of evolution measurement to adjust population size dynamically during evolution. Such a strategy can stabilize the rate of genetic substitutions and effectively improve the performance of a GP system over fixed-size populations. This rate of evolution measure also provides an avenue to study evolvability, since it captures how the two sides of evolvability, i.e., variability and neutrality, interact and cooperate with each other during evolution. We show that evolvability can be better understood in the light of this interplay and how this can be used to generate adaptive phenotypic variation via harnessing random genetic variation. The rate of evolution measure and the adaptive population size scheme are further transferred to a Genetic Algorithm (GA) to solve a real world application problem - the wireless network planning problem. Computer simulation of such an application proves that the adaptive population size scheme is able to improve a GA's performance against conventional fixed population size algorithms

    Computational Design and Experimental Validation of Functional Ribonucleic Acid Nanostructures

    In living cells, two major classes of ribonucleic acid (RNA) molecules can be found. The first class, messenger RNA (mRNA), carries the genetic information that the ribosome reads and translates into proteins. The second class, non-coding RNA (ncRNA), does not code for proteins and is involved in key cellular processes, such as gene expression regulation, splicing, differentiation, and development. NcRNAs fold into an ensemble of thermodynamically stable secondary structures, which will eventually lead the molecule to fold into a specific 3D structure. It is widely known that ncRNAs carry out their functions via their 3D structures as well as their molecular composition. The secondary structure of ncRNAs is composed of different types of structural elements (motifs) such as stacking base pairs, internal loops, hairpin loops and pseudoknots. Pseudoknots are particularly difficult to model, are abundant in nature and are known to stabilize the functional form of the molecule. Due to the diverse range of functions of ncRNAs, their computational design and analysis have numerous applications in nano-technology, therapeutics, synthetic biology, and materials engineering. The RNA design problem is to find novel RNA sequences that are predicted to fold into target structure(s) while satisfying specific qualitative characteristics and constraints. RNA design can be modeled as a combinatorial optimization problem (COP) and is known to be computationally challenging or, more precisely, NP-hard. Numerous algorithms to solve the RNA design problem have been developed over the past two decades; however, most ignore pseudoknots and therefore limit application to only a slice of real-world modeling and design problems. Moreover, the few existing pseudoknot designer methods, which were developed only recently, do not provide any evidence about the applicability of their proposed design methodology in biological contexts. 
The two objectives of this thesis are set to address these two shortcomings. First, we are interested in developing an efficient computational method for the design of RNA secondary structures including pseudoknots that shows significantly better in-silico quality characteristics than the state of the art. Second, we are interested in showing the real-world worthiness of the proposed method by validating it experimentally. More precisely, our aim is to design instances of certain types of RNA enzymes (i.e. ribozymes) and demonstrate that they are functionally active. This would likely only happen if their predicted folding matched their actual folding in the in-vitro experiments. In this thesis, we present four contributions. First, we propose a novel adaptive defect weighted sampling algorithm to efficiently solve the RNA secondary structure design problem where pseudoknots are included. We compare the performance of our design algorithm with the state of the art and show that our method generates molecules that are thermodynamically more stable and less defective than those generated by state of the art methods. Moreover, we show that when the effect of fitness evaluation is decoupled from the search and optimization process, our optimization method converges faster than the non-dominated sorting genetic algorithm (NSGA II) and the ant colony optimization (ACO) algorithm do. Second, we use our algorithmic development to implement an RNA design pipeline called Enzymer and make it available as an open source package useful for wet lab practitioners and RNA bioinformaticians. Enzymer uses multiple sequence alignment (MSA) data to generate initial design templates for further optimization. Our design pipeline can then be used to re-engineer naturally occurring RNA enzymes such as ribozymes and riboswitches. Our first and second contributions are published in the RNA section of the Journal of Frontiers in Genetics. 
Third, we use Enzymer to reengineer three different species of pseudoknotted ribozymes: a hammerhead ribozyme from the mouse gut metagenome, a hammerhead ribozyme from Yarrowia lipolytica and a glmS ribozyme from Thermoanaerobacter tengcongensis. We designed a total of 18 ribozyme sequences and showed that 16 of them were active in-vitro. Our experimental results have been submitted to the RNA journal and strongly suggest that Enzymer is a reliable tool to design pseudoknotted ncRNAs with desired secondary structure. Finally, we propose a novel architecture for a new ribozyme-based gene regulatory network where a hammerhead ribozyme modulates expression of a reporter gene when an external stimulus IPTG is present. Our in-vivo experiments show the expected results in 7 out of 12 cases

    Towards the control of cell states in gene regulatory networks by evolving Boolean networks

    Biological cell behaviours emerge from complex patterns of interactions between genes and their products, known as gene regulatory networks (GRNs). More specifically, GRNs are complex dynamical structures that orchestrate the activities of biological cells by governing the expression of mRNA and proteins. Many computational models of these networks have been shown to be able to carry out complex computation in an efficient and robust manner, particularly in the domains of control and signal processing. GRNs play a central role within living organisms and efficient strategies for controlling their dynamics need to be developed. For instance, the ability to push a cell towards or away from certain behaviours is an important aim in fields such as medicine and synthetic biology. This could, for example, help to find novel approaches in the design of therapeutic drugs. However, current approaches to controlling these networks exhibit poor scalability and limited generality. This thesis proposes a new approach and an alternative method for performing state space targeting in GRNs, by coupling an artificial GRN to an existing GRN. This idea is tested in simulation by coupling together Boolean networks that represent controlled and controller systems. Evolutionary algorithms are used to evolve the controller Boolean networks. Controller Boolean networks are applied to a range of controlled Boolean networks including Boolean models of actual biological circuits, each with different dynamics. The results show that controller Boolean networks can be optimised to control trajectories in the target networks. Also, the approach scales well as the target network size increases. The use of Boolean modelling is potentially advantageous from an implementation perspective, since synthetic biology techniques can be used to refine an optimised controller Boolean network into an in vivo form, which could then control a genetic network directly from within a cell

    Measurement of Ribozyme Cleavage Reaction Using Toehold Mediated Strand Displacement; Design, Validation and Possible Applications

    Non-coding RNAs or ncRNAs are RNA molecules that are not translated but play functional roles within cells. Some of these ncRNAs possess enzymatic properties. These molecules are termed ribozymes. Ribozymes mainly catalyze nucleic acid strand scission reactions with or without the help of protein molecules. Ribozymes such as hammerhead ribozymes (HHRs) are known to mediate gene silencing and RNA processing. Single stranded RNA/DNA (ssRNA/DNA) inducible HHRs or tetracycline inducible aptazymes exist. Using these HHRs, different types of logic gates can be designed, activated by one or more inputs including ssDNA and ssRNA. Evaluating HHR kinetics of cleavage is essential to understand their mechanism, characterize HHR mutants and to properly estimate several parameters important to design RNA-based logic circuits. Firstly, we developed a novel methodology to detect HHR kinetics using toehold mediated strand displacement reaction (TMSDR). A probe composed of a fluorophore and a quencher was designed to measure the kinetics of HHR cleavage reactions without labelling RNA molecules, regular sampling or the utilization of polyacrylamide gels. This probe consists of two DNA strands; one strand is labelled with a fluorophore at its 5′ end, while the other strand is labelled with a quencher at its 3′ end. These two DNA strands are complementary, but the fluorophore strand is longer than the quencher strand at its 3′ end. The unpaired extra nucleotides act as a toehold, which is utilized by a detached cleaved fragment, coming from a self-cleaving hammerhead ribozyme, as the starting point for the strand displacement reaction. This reaction will cause the separation of the fluorophore strand from the quencher strand, culminating in fluorescence detectable in a plate reader. This fluorescence is proportional to the amount of detached cleaved-off RNA strand displacing the DNA quencher strand. 
This method can be used to replace radio-hazardous unstable 32P as a means of measurement of the kinetics of ribozyme cleavage reactions; it also eliminates the need for use of polyacrylamide gels for the same purpose. Critically, this method allows experimenters to distinguish between the amount of cleaved ribozyme and the amount of detached cleaved-off fragments, resulting from the cleavage. Secondly, we developed doubler HHRs that cleave twice upon induction with a single input strand (ssDNA/ssRNA). Outputs can be heterogeneous (Hetero doubler) or identical (Homo doubler). Homo doublers were designed to work as amplifying components in RNA amplifiers. We showed two potential doubler HHRs from two different designs (First doubler and D1 doubler). In conclusion, we found that the concentration of detached cleaved-off fragments is relatively low and hence we developed homo-doublers to increase the concentration of cleaved-off fragments

    The FDA-Approved Drug Cobicistat Synergizes with Remdesivir To Inhibit SARS-CoV-2 Replication In Vitro and Decreases Viral Titers and Disease Progression in Syrian Hamsters

    Combinations of direct-acting antivirals are needed to minimize drug resistance mutations and stably suppress replication of RNA viruses. Currently, there are limited therapeutic options against the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and testing of a number of drug regimens has led to conflicting results. Here, we show that cobicistat, which is an FDA-approved drug booster that blocks the activity of the drug-metabolizing proteins cytochrome P450-3As (CYP3As) and P-glycoprotein (P-gp), inhibits SARS-CoV-2 replication. Two independent cell-to-cell membrane fusion assays showed that the antiviral effect of cobicistat is exerted through inhibition of spike protein-mediated membrane fusion. In line with this, incubation with low-micromolar concentrations of cobicistat decreased viral replication in three different cell lines including cells of lung and gut origin. When cobicistat was used in combination with remdesivir, a synergistic effect on the inhibition of viral replication was observed in cell lines and in a primary human colon organoid. This was consistent with the effects of cobicistat on two of its known targets, CYP3A4 and P-gp, the silencing of which boosted the in vitro antiviral activity of remdesivir in a cobicistat-like manner. When administered in vivo to Syrian hamsters at a high dose, cobicistat decreased viral load and mitigated clinical progression. These data highlight cobicistat as a therapeutic candidate for treating SARS-CoV-2 infection and as a potential building block of combination therapies for COVID-19