280 research outputs found

    Statistics and Evolution of Functional Genomic Sequence

    In this thesis, three separate problems of genomics are addressed, utilizing methods related to the field of statistical mechanics. The goal of the project discussed in the first chapter is the elucidation of post-transcriptional gene regulation imposed by microRNAs, a recently discovered class of tiny non-coding RNAs. A probabilistic algorithm for the computational identification of genes regulated by microRNAs is introduced, which was developed based on experimental data and statistical analysis of whole-genome data. In particular, the application of this algorithm to multiple alignments of groups of related species allows for the specific and sensitive detection of genes targeted by microRNAs on a genome-wide level. Examination of clade-specific predictions and cross-clade comparison yields deeper insights into microRNA biology and first clues about the long-term evolution of microRNA regulation, which are discussed in detail. Modeling the evolutionary dynamics of microsatellites, an abundant class of repetitive sequence in eukaryotic genomes, was the objective of the second project and is discussed in chapter two. Inspired by the putative functionality of some of these elements and the difficulty of constructing correct sequence alignments that reflect the evolutionary relationships between microsatellites, a neutral model for microsatellite evolution is developed and tested in the fruit fly Drosophila melanogaster by comparing evolutionary rates predicted by the model to independent measurements of these rates from multiple alignments of three closely related Drosophila species. The model is applied separately to genomic sequence categories of different functional annotations in order to assess the varying influence of selective constraint among these categories. In the last chapter, a general population genetic model is introduced that allows for the determination of transcription factor binding site stability as a function of selection strength, mutation rate and effective population size at arbitrary values of these parameters. The analytical solution of this model gives the probability that a binding site is functional. The model is used to compute the population fraction of functional binding sites at fixed selection pressure across a variety of different taxa. The results lead to the conclusion that a decreasing effective population size, such as that observed at the evolutionary transition from prokaryotes to eukaryotes, could result in loss of binding site stability. An extension of the model is used to assess the compensatory effect of the emergence of multiple binding sites for the same transcription factor in order to maintain the existing regulatory relationship.

    PLoS One


    Haloquadratum walsbyi: Limited Diversity in a Global Pond

    BACKGROUND: Haloquadratum walsbyi commonly dominates the microbial flora of hypersaline waters. Its cells are extremely fragile squares requiring >14% (w/v) salt for growth, properties that should limit its dispersal and promote geographical isolation and divergence. To assess this, the genome sequences of two isolates recovered from sites at near maximum distance on Earth were compared. PRINCIPAL FINDINGS: Both chromosomes are 3.1 Mb in size, and 84% of each sequence was highly similar to the other (98.6% identity), comprising the core sequence. ORFs of this shared sequence were completely syntenic (conserved in genomic orientation and order), without inversion or rearrangement. Strain-specific insertions/deletions could be precisely mapped, often allowing the genetic events to be inferred. Many inferred deletions were associated with short direct repeats (4-20 bp). Deletion-coupled insertions are frequent, producing different sequences at identical positions. In cases where the inserted and deleted sequences are homologous, this leads to variant genes in a common syntenic background (as already described by others). Cas/CRISPR systems are present in C23(T) but have been lost in HBSQ001 except for a few spacer remnants. Numerous types of mobile genetic elements occur in both strains, most of which appear to be active, and some of which specifically target others. Strain C23(T) carries two ∼6 kb plasmids that show similarity to halovirus His1 and to sequences near halovirus/plasmid gene clusters commonly found in haloarchaea. CONCLUSIONS: Deletion-coupled insertions show that Hqr. walsbyi evolves by uptake and precise integration of foreign DNA, probably originating from close relatives. Change is also driven by mobile genetic elements, but these do not by themselves explain the atypically low gene coding density found in this species. The remarkable genome conservation despite the presence of active systems for genome rearrangement implies both an efficient global dispersal system and a high selective fitness for this species.

    From Genes to Ecosystems: Resource Availability and DNA Methylation Drive the Diversity and Abundance of Restriction Modification Systems in Prokaryotes

    Together, prokaryotic hosts and their viruses numerically dominate the planet and are engaged in an eternal struggle of hosts evading viral predation and viruses overcoming defensive mechanisms employed by their hosts. Prokaryotic hosts have been found to carry several viral defense systems in recent years, with Restriction Modification systems (RMs) the first to be discovered, in the 1950s. While we have biochemically elucidated many of these systems in the last 70 years, we still struggle to understand what drives their gain and loss in prokaryotic genomes. In this work, we take a computational approach to understand the underlying evolutionary drivers of RMs by assessing ‘big data’ signals of RMs in prokaryotic genomes and incorporating molecular data in trait-based mathematical models. Focusing on the Cyanobacteria, we found a large discrepancy in the frequency of RMs per genome in different environmental contexts, where Cyanobacteria that live in oligotrophic nutrient conditions have few to no RMs and those in nutrient-rich conditions consistently have many RMs. While our models agree with the observation that increased nutrient inputs make the selective pressure of RMs more intense, they were unable to reconcile the high numbers of RMs per genome with their potent defensive properties, a situation of apparent overkill. By incorporating viral methylation, an unavoidable effect of RMs, we were able to explain how organisms could carry over 15 RMs. With this discovery, we then tried to reassess the distribution of methyltransferases, an essential component of RMs that can also have alternate physiological roles in the cell. We expand on the conventional wisdom that methyltransferases that are widely phylogenetically conserved are associated with global cellular regulation. However, we also find that organisms with high numbers of RMs also have a surprising amount of conservation in the methyltransferases that they carry. These data suggest that caution should be used in associating phylogenetic signals with functional roles in methyltransferases, as different functional roles seem to overlap in their phylogenetic signal. Indeed, we suggest trait-based modeling may be the best tool for elucidating why organisms with a high selective pressure to maintain RMs appear to have conserved methyltransferases.

    Autopolyploid inheritance and a heterozygous reciprocal translocation shape chromosome genetic behavior in tetraploid blueberry (Vaccinium corymbosum)

    Understanding chromosome recombination behavior in polyploid species is key to advancing genetic discoveries. In blueberry, a tetraploid species, genetic behavior remains poorly understood, owing to the inter-specific and inter-ploidy admixture of its genome and the lack of in-depth genome-wide inheritance and comparative structural studies. Here we describe a new high-quality, phased, chromosome-scale genome of a diploid blueberry, clone W85. The genome was integrated with cytogenetics and high-density genetic maps representing six tetraploid blueberry cultivars, harboring different levels of wild genome admixture, to uncover recombination behavior and structural genome divergence across tetraploid and wild diploid species. Analysis of chromosome inheritance and pairing demonstrated that tetraploid blueberry behaves as an autotetraploid with tetrasomic inheritance. Comparative analysis demonstrated the presence of a reciprocal, heterozygous translocation spanning one homolog of chr-6 and one of chr-10 in the cultivar Draper. The translocation affects pairing and recombination of chromosomes 6 and 10. Besides the translocation detected in Draper, no other structural genomic divergences were detected across tetraploid cultivars and highly inter-crossable wild diploid species. These findings and resources will facilitate new genetic and comparative genomic studies in Vaccinium and the development of genomic-assisted selection strategies for this crop.

    BAC transgene arrays as a model system for studying large-scale chromatin structure

    The folding of interphase chromatin into large-scale chromatin structure and its spatial organization within the nucleus have been suggested to play important roles in gene regulation. In this study, we created engineered chromatin regions consisting of tandem repeats of BAC transgenes, which contain 150-200 kb of defined genomic regions, and used them as a model system to study the mechanisms and functional significance of large-scale chromatin organization. The BAC transgene arrays recapitulated several important features of endogenous chromatin, including transcription level and intranuclear positioning. Using this system, we showed that tandem arrays of housekeeping gene loci form open large-scale chromatin structure independent of their genomic integration sites, including insertions within centromeric heterochromatin. This BAC-specific large-scale chromatin conformation provided a permissive environment for transcription, as evidenced by the copy-number-dependent and position-independent expression of embedded reporter mini-genes. This led to the development of a novel method for reliable transgene expression in mammalian cells, which should prove useful in a number of therapeutic and scientific applications. We also demonstrated that BAC transgene arrays can be employed as an effective system for dissecting sequence determinants for intranuclear positioning of gene loci. We showed that in mouse ES and fibroblast cells a BAC carrying a 200 kb human genomic fragment containing the beta-globin locus autonomously targets to the nuclear periphery. Using BAC recombineering, we dissected this 200 kb region and identified two genomic regions sufficient to target the BAC transgenes to the nuclear periphery. This study represents a first step towards elucidation of the molecular mechanism for the nuclear peripheral localization of genes in mammalian cells.

    Design, synthesis and characterization of the synthetic yeast genome

    With the rapid development of DNA synthesis technologies, synthetic biology has made tremendous progress in the past 15 years, in particular in synthetic genomics. Synthetic genomics is a nascent field of synthetic biology, which aims to design new biological systems/organisms to satisfy human needs. Conventional synthetic biology focuses on the redesign, construction and modeling of biological parts, pathways or genomes that do not exist in nature, while synthetic genomics encompasses technologies that allow the generation of chemically synthesized larger parts of genomes or whole genomes, with simultaneous redesign of an organism’s genetic material. Synthetic genomics is painting a blueprint for a new era of biology and holds great potential for a multitude of applications, such as pharmaceuticals, biofuels and rapid generation of vaccines against emerging diseases. Chapter One introduces the current state of the art and challenges of synthetic genomics and the objectives of this study. Chapter Two demonstrates the design and construction strategy of two megabase-long synthetic yeast chromosomes, SynII and SynVII. Chapter Three describes the full characterization of SynII and SynVII. Chapter Four introduces the SCRaMbLE (Synthetic Chromosome Rearrangement and Modification by LoxPsym-mediated Evolution) system and its application in SynII and SynVII. Taken together, this work demonstrates the utility of synthetic yeast for understanding biological systems and its potential for industrial applications.