10,007 research outputs found

    Modeling And Identification Of Differentially Regulated Genes Using Transcriptomics And Proteomics Data

    Photosynthetic organisms are complex dynamical systems, showing a remarkable ability to adapt to different environmental conditions for their survival. Mechanisms underlying the coordination between different cellular processes in these organisms are still poorly understood. In this dissertation we utilize various computational and modeling techniques to analyze transcriptomics and proteomics data sets from several photosynthetic organisms. We use changes in expression levels of genes to study responses of these organisms to various environmental conditions such as availability of nutrients, concentrations of chemicals in growth media, and temperature. Three specific problems studied here are transcriptomic modifications in photosynthetic organisms under reduction-oxidation (redox) stress conditions, circadian and diurnal rhythms of cyanobacteria and the effect of incident light patterns on these rhythms, and the coordination between biological processes in cyanobacteria under various growth conditions. Under redox stresses caused by high light treatments, a strong transcriptomic level response, spread across many biological processes, is discovered in the cyanobacterium Synechocystis sp. PCC 6803. Based on statistical tests, expression levels of about 20% of genes in Synechocystis 6803 are identified as significantly affected by high light. Gene clustering methods reveal that these responses can mainly be classified as transient or consistent, depending on the duration of the modified behaviors. Many genes related to energy production as well as energy utilization are shown to be strongly affected. Analysis of microarray data under two stress conditions, high light and DCMU treatment, combined with data mining and motif finding algorithms, led to the discovery of a novel transcription factor, RRTF1, that responds to redox stresses in Arabidopsis thaliana. Time course transcriptomics data from Cyanothece sp. ATCC 51142 have shown strong diurnal rhythms. By combining multiple experimental conditions and using gene classification algorithms based on Fourier scores and angular distances, it is shown that the majority of the diurnal genes are in fact light responding. Only about 10% of genes in the genome are categorized as being circadian controlled. A transcription control model based on dynamical systems is employed to identify the interactions between diurnal genes. A phase oscillator network is proposed to model the behavior of different biological processes. Both these models are shown to carry biologically meaningful features. To study the coordination between different biological processes in response to various environmental and genetic modifications, an interaction model is derived using a Bayesian network approach, combining all publicly available microarray data sets for Synechocystis sp. PCC 6803. Several novel relationships between biological processes are discovered from the model. The model is used to simulate several experimental conditions, and the response of the model is shown to agree with the experimentally observed behaviors.
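
The abstract above mentions classifying genes by Fourier scores and angular distances without defining either. As a rough illustration only, the sketch below assumes one common reading: the Fourier score is the fraction of a gene's (mean-removed) expression variance that sits at the 24-hour period, and the angular distance is the circular gap between two peak times. The function names, the 4-hour sampling interval, and both definitions are assumptions made for this example, not the dissertation's method.

```python
import cmath
import math


def fourier_profile(values, sample_hours, period_hours=24.0):
    """Return (score, peak_hour) for an evenly sampled expression time course.

    score is the fraction of mean-removed signal power at `period_hours`
    (assumed definition); it is close to 1 for a pure sinusoid sampled over
    whole periods. peak_hour is the time of maximum of the fitted sinusoid.
    """
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    total_power = sum(c * c for c in centered)
    if total_power == 0:
        return 0.0, 0.0  # flat profile: no rhythm to score
    omega = 2 * math.pi / period_hours
    # Single-frequency discrete Fourier coefficient at the target period.
    coeff = sum(
        c * cmath.exp(-1j * omega * k * sample_hours)
        for k, c in enumerate(centered)
    )
    score = 2 * abs(coeff) ** 2 / (n * total_power)
    peak_hour = (-cmath.phase(coeff) / omega) % period_hours
    return score, peak_hour


def angular_distance(peak_a, peak_b, period_hours=24.0):
    """Shortest circular distance (in hours) between two peak times."""
    d = abs(peak_a - peak_b) % period_hours
    return min(d, period_hours - d)


if __name__ == "__main__":
    hours = [4 * k for k in range(12)]  # hypothetical 48 h course, 4 h steps
    gene = [math.cos(2 * math.pi * (h - 8) / 24) for h in hours]
    print(fourier_profile(gene, sample_hours=4.0))
```

Under these assumptions, a gene with a high score under light-dark cycles but a low score under constant light would be read as light responding rather than circadian controlled, which is the kind of comparison the abstract describes.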

    The molecular underpinnings of neuronal cell identity in the stomatogastric ganglion of Cancer borealis

    Throughout the life of an organism, the nervous system must balance change in response to environmental stimuli with the need to produce reliable, repeatable activity patterns to create stereotyped behaviors. Understanding the mechanisms responsible for this regulation requires a wealth of knowledge about the neural system, ranging from network connectivity and cell type identification to intrinsic neuronal excitability and transcriptomic expression. To make strides in this area, we have employed the well-described stomatogastric nervous system of the Jonah crab Cancer borealis to examine the molecular underpinnings and regulation of neuron cell identity. Several crustacean circuits, including the stomatogastric nervous system and the cardiac ganglion, continue to provide important new insights into circuit dynamics and modulation (Diehl, White, Stein, & Nusbaum, 2013; Marder, 2012; Marder & Bucher, 2007; Williams et al., 2013), but this work has been partially hampered by the lack of extensive molecular sequence knowledge in crustaceans. Here we generated a de novo transcriptome assembly from central nervous system tissue for C. borealis, producing 42,766 contigs, focusing on an initial identification, curation, and comparison of genes that will have the most profound impact on our understanding of circuit function in these species. This included genes for 34 distinct ion channel types, 17 biogenic amine and 5 GABA receptors, 28 major transmitter receptor subtypes including glutamate and acetylcholine receptors, and 6 gap junction proteins (the innexins). ... With this reference transcriptome and annotated sequences in hand, we sought to determine the strengths and limitations of using the neuronal molecular profile to classify neurons into cell types. ... 
Since the resulting activity of a neuron is the product of the expression of ion channel genes, we sought to further probe the expression profile of neurons across a range of cell types to understand how these patterns of mRNA abundance relate to the properties of individual cell types. ... Finally, we sought to better understand the molecular underpinnings of how these correlated patterns of mRNA expression are generated and maintained. Includes bibliographical references.

    Identification of Yeast Transcriptional Regulation Networks Using Multivariate Random Forests

    The recent availability of whole-genome scale data sets that investigate complementary and diverse aspects of transcriptional regulation has spawned an increased need for new and effective computational approaches to analyze and integrate these large scale assays. Here, we propose a novel algorithm, based on random forest methodology, to relate gene expression (as derived from expression microarrays) to sequence features residing in gene promoters (as derived from DNA motif data) and transcription factor binding to gene promoters (as derived from tiling microarrays). We extend the random forest approach to model a multivariate response as represented, for example, by time-course gene expression measures. An analysis of the multivariate random forest output reveals complex regulatory networks, which consist of cohesive, condition-dependent regulatory cliques. Each regulatory clique features homogeneous gene expression profiles and common motifs or synergistic motif groups. We apply our method to several yeast physiological processes: cell cycle, sporulation, and various stress conditions. Our technique displays excellent performance with regard to identifying known regulatory motifs, including high order interactions. In addition, we present evidence of the existence of an alternative MCB-binding pathway, which we confirm using data from two independent cell cycle studies and two other physiological processes. Finally, we have uncovered elaborate transcription regulation refinement mechanisms involving PAC and mRRPE motifs that govern essential rRNA processing. These include intriguing instances of differing motif dosages and differing combinatorial motif control that promote regulatory specificity in rRNA metabolism under differing physiological processes.

    Bow-tie signaling in c-di-GMP: Machine learning in a simple biochemical network

    Bacteria of many species rely on a simple molecule, the intracellular secondary messenger c-di-GMP (Bis-(3'-5')-cyclic dimeric guanosine monophosphate), to make a vital choice: whether to stay in one place and form a biofilm, or to leave it in search of better conditions. The c-di-GMP network has a bow-tie shaped architecture that integrates many signals from the outside world (the input stimuli) into intracellular c-di-GMP levels that then regulate genes for biofilm formation or for swarming motility (the output phenotypes). How does the ‘uninformed’ process of evolution produce a network with the right input/output association and enable bacteria to make the right choice? Inspired by new data from 28 clinical isolates of Pseudomonas aeruginosa and strains evolved in laboratory experiments, we propose a mathematical model where the c-di-GMP network is analogous to a machine learning classifier. The analogy immediately suggests a mechanism for learning through evolution: adaptation through incremental changes in c-di-GMP network proteins acquires knowledge from past experiences and enables bacteria to use it to direct future behaviors. Our model clarifies the elusive function of the ubiquitous c-di-GMP network, a key regulator of bacterial social traits associated with virulence. More broadly, the link between evolution and machine learning can help explain how natural selection across fluctuating environments produces networks that enable living organisms to make sophisticated decisions.

    Artificial Ontogenies: A Computational Model of the Control and Evolution of Development

    Understanding the behaviour of biological systems is a challenging task. Gene regulation, development and evolution are each a product of nonlinear interactions between many individual agents: genes, cells or organisms. Moreover, these three processes are not isolated, but interact with one another in an important fashion. The development of an organism involves complex patterns of dynamic behaviour at the genetic level. The gene networks that produce this behaviour are subject to mutations that can alter the course of development, resulting in the production of novel morphologies. Evolution occurs when these novel morphologies are favoured by natural selection and survive to pass on their genes to future generations. Computational models can assist us to understand biological systems by providing a framework within which their behaviour can be explored. Many natural processes, including gene regulation and development, have a computational element to their control. Constructing formal models of these systems enables their behaviour to be simulated, observed and quantified on a scale not otherwise feasible. This thesis uses a computational simulation methodology to explore the relationship between development and evolution. An important question in evolutionary biology is how to explain the direction of evolution. Conventional explanations of evolutionary history have focused on the role of natural selection in orienting evolution. More recently, it has been argued that the nature of development, and the way it changes in response to mutation, may also be a significant factor. A network-lineage model of artificial ontogenies is described that incorporates a developmental mapping between the dynamics of a gene network and a cell lineage representation of a phenotype. 
Three series of simulation studies are reported, exploring: (a) the relationship between the structure of a gene network and its dynamic behaviour; (b) the characteristic distributions of ontogenies and phenotypes generated by the dynamics of gene networks; (c) the effect of these characteristic distributions on the evolution of ontogeny. The results of these studies indicate that the model networks are capable of generating a diverse range of stable behaviours, and possess a small yet significant sensitivity to perturbation. In the context of developmental control, the intrinsic dynamics of the model networks predispose the production of ontogenies with a modular, quasi-systematic structure. This predisposition is reflected in the structure of variation available for selection in an adaptive search process, resulting in the evolution of ontogenies biased towards simplicity. These results suggest a possible explanation for the levels of ontogenetic complexity observed in biological organisms: that they may be a product of the network architecture of developmental control. By quantifying complexity, variation and bias, the network-lineage model described in this thesis provides a computational method for investigating the effects of development on the direction of evolution. In doing so, it establishes a viable framework for simulating computational aspects of complex biological systems.

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    The organization and mining of malaria genomic and post-genomic data are strongly motivated by the need to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal.