    Error-driven adaptive resolutions for large scientific data sets

    The process of making observations and drawing conclusions from large data sets is an essential part of modern scientific research. However, the size of these data sets can easily exceed the available resources of a typical workstation, making visualization and analysis a formidable challenge. Many solutions, including multiresolution and adaptive resolution representations, have been proposed and implemented to address these problems. This thesis describes an error model for calculating and representing localized error from data reduction and a process for constructing error-driven adaptive resolutions from this data, allowing fully renderable error-driven adaptive resolutions to be constructed from a single, high-resolution data set. We evaluated the performance of these adaptive resolutions generated with various parameters compared to the original data set. We found that adaptive resolutions generated with reasonable subdomain sizes and error tolerances show improved performance during visualization.
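
The idea of choosing a resolution per subdomain from a localized error measure can be illustrated with a minimal sketch. The abstract does not specify the thesis's error model, so everything here is an assumption made for illustration: the data are one-dimensional, reduction is done by block averaging, error is the maximum absolute deviation from the averaged value, and the function names and candidate factors are hypothetical.

```python
def downsample(block, factor):
    """Average each consecutive group of `factor` samples (assumed reduction scheme)."""
    return [
        sum(block[i:i + factor]) / len(block[i:i + factor])
        for i in range(0, len(block), factor)
    ]


def max_error(block, factor):
    """Largest absolute difference between the block and its
    piecewise-constant reconstruction from the downsampled values."""
    coarse = downsample(block, factor)
    return max(abs(v - coarse[i // factor]) for i, v in enumerate(block))


def adaptive_resolution(data, subdomain_size, tolerance, factors=(8, 4, 2)):
    """For each subdomain, keep the coarsest reduction factor whose
    localized error stays within `tolerance`; otherwise keep full resolution."""
    result = []
    for start in range(0, len(data), subdomain_size):
        block = data[start:start + subdomain_size]
        chosen = 1  # factor 1 means the subdomain is kept at full resolution
        for factor in factors:  # candidates are tried coarsest first
            if max_error(block, factor) <= tolerance:
                chosen = factor
                break
        result.append((start, chosen, downsample(block, chosen)))
    return result
```

Under these assumptions, a smooth subdomain collapses to a few samples while a rapidly varying one stays at full resolution, which is the trade-off between subdomain size and error tolerance that the abstract evaluates.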

    Why genes evolve faster on secondary chromosomes in bacteria

    In bacterial genomes composed of more than one chromosome, one replicon is typically larger, harbors more essential genes than the others, and is considered primary. The greater variability of secondary chromosomes among related taxa has led to the theory that they serve as an accessory genome for specific niches or conditions. By this rationale, purifying selection should be weaker on genes on secondary chromosomes because of their reduced necessity or usage. To test this hypothesis we selected bacterial genomes composed of multiple chromosomes from two genera, Burkholderia and Vibrio, and quantified the evolutionary rates (dN and dS) of all orthologs within each genus. Both evolutionary rate parameters were faster among orthologs found on secondary chromosomes than those on the primary chromosome. Further, in every bacterial genome with multiple chromosomes that we studied, genes on secondary chromosomes exhibited significantly weaker codon usage bias than those on primary chromosomes. Faster evolution and reduced codon bias could in turn result from global effects of chromosome position, as genes on secondary chromosomes experience reduced dosage and expression due to their delayed replication, or selection on specific gene attributes. These alternatives were evaluated using orthologs common to genomes with multiple chromosomes and genomes with single chromosomes. Analysis of these ortholog sets suggested that inherently fast-evolving genes tend to be sorted to secondary chromosomes when they arise; however, prolonged evolution on a secondary chromosome further accelerated substitution rates. In summary, secondary chromosomes in bacteria are evolutionary test beds where genes are weakly preserved and evolve more rapidly, likely because they are used less frequently

    Evolutionary rates and gene dispensability associate with replication timing in the Archaeon Sulfolobus islandicus

    In bacterial chromosomes, the position of a gene relative to the single origin of replication generally reflects its replication timing, how often it is expressed, and consequently, its rate of evolution. However, because some archaeal genomes contain multiple origins of replication, bias in gene dosage caused by delayed replication should be minimized and hence the substitution rate of genes should associate less with chromosome position. To test this hypothesis, six archaeal genomes from the genus Sulfolobus containing three origins of replication were selected, conserved orthologs were identified, and the evolutionary rates (dN and dS) of these orthologs were quantified. Ortholog families were grouped by their consensus position and designated by their proximity to one of the three origins (O1, O2, O3). Conserved orthologs were concentrated near the origins and most variation in genome content occurred distant from the origins. Linear regressions of both synonymous and nonsynonymous substitution rates on distance from replication origins were significantly positive, the rates being greatest in the region furthest from any of the origins and slowest among genes near the origins. Genes near O1 also evolved faster than those near O2 and O3, which suggests that this origin may fire later in the cell cycle. Increased evolutionary rates and gene dispensability are strongly associated with reduced gene expression caused in part by reduced gene dosage during the cell cycle. Therefore, in this genus of Archaea as well as in many Bacteria, evolutionary rates and variation in genome content associate with replication timing.
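
The regression of substitution rate on distance from the nearest replication origin can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline: it assumes a circular chromosome with coordinates in base pairs, uses ordinary least squares, and the function names are hypothetical.

```python
def distance_to_nearest_origin(position, origins, genome_length):
    """Shortest distance along a circular chromosome from a gene
    position to the closest replication origin."""
    return min(
        min(abs(position - origin), genome_length - abs(position - origin))
        for origin in origins
    )


def least_squares(xs, ys):
    """Slope and intercept of an ordinary least-squares fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

With per-ortholog distances as `xs` and dN or dS estimates as `ys`, a positive slope corresponds to the pattern the abstract reports: faster substitution farther from any origin. Testing whether the slope is significant would need an additional step that this sketch omits.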

    Natural selection shaped the rise and fall of passenger pigeon genomic diversity

    The extinct passenger pigeon was once the most abundant bird in North America, and possibly the world. Although theory predicts that large populations will be more genetically diverse, passenger pigeon genetic diversity was surprisingly low. To investigate this disconnect, we analyzed 41 mitochondrial and 4 nuclear genomes from passenger pigeons and 2 genomes from band-tailed pigeons, which are passenger pigeons' closest living relatives. Passenger pigeons' large population size appears to have allowed for faster adaptive evolution and removal of harmful mutations, driving a huge loss in their neutral genetic diversity. These results demonstrate the effect that selection can have on a vertebrate genome and contradict results that suggested that population instability contributed to this species's surprisingly rapid extinction

    Identification and mixture deconvolution of ancient and forensic DNA using population genomic data

    Forensic scientists routinely use DNA for identification and to match samples with individuals. Although standard approaches are effective on a wide variety of samples in various conditions, issues such as low-template DNA samples and mixtures of DNA from multiple individuals pose significant challenges. Extreme examples of these challenges can be found in the field of ancient DNA, where DNA recovered from ancient remains is highly fragmented and marked by patterns of DNA damage. Additionally, ancient libraries are often characterized by low endogenous DNA content and contaminating DNA from outside sources. As a result, standard forensics approaches, such as amplification of short tandem repeats, are not effective on ancient samples. Instead, ancient DNA is routinely sequenced directly using high-throughput sequencing to survey the molecules that are present within a library. However, the resulting sequences are not easily compared for the purposes of identification, as each data set represents a random and, in some cases, non-overlapping sample of the genome. In this dissertation, I present two approaches for interpreting shotgun sequences that address two common issues in forensic and ancient DNA: extremely low nuclear genome coverage and mixtures of sequences from multiple individuals. First, I present an approach to test for a common source individual between extremely low-coverage sequence data sets that makes use of the vast number of single-nucleotide polymorphisms (SNPs) discovered by surveys of human genetic diversity. As almost no observed SNP positions will be common to both samples, our method uses patterns of linkage disequilibrium as modeled by a panel of haplotypes to determine whether observations made across samples are consistent with originating from a single individual. I demonstrate the power of this approach using coalescent simulations, downsampled high-throughput sequencing data and published ancient DNA data. Second, I present an approach for interpreting mixtures of mitochondrial DNA sequences from multiple individuals. Mixed DNA samples are common in forensics investigations, either from the direct nature of a case (e.g., a sample containing DNA from both a victim and a perpetrator) or from outside contamination. I describe an expectation maximization approach for detecting the mitochondrial haplogroups contributing to a mixture and partitioning fragments by haplogroup to reconstruct the underlying haplotypes. I demonstrate the approach's feasibility, accuracy, and sensitivity on both in silico and in vitro sequence mixtures. Finally, I present the results of applying our mixture interpretation approach on ancient contact DNA recovered from ~700-year-old moccasin and cordage samples.
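
The expectation maximization step for mixture interpretation can be illustrated with a generic sketch. The abstract does not describe the dissertation's actual model, so this is an assumption-laden example: it takes as given a table of per-fragment likelihoods under each candidate haplogroup (all assumed positive), estimates mixture proportions, and then assigns each fragment to its most probable haplogroup. The function names are hypothetical.

```python
def em_mixture_proportions(likelihoods, iterations=100):
    """Estimate mixture proportions by expectation maximization.

    likelihoods[i][k] is the assumed probability of fragment i given
    haplogroup k. Returns the proportions and the per-fragment
    responsibilities from the final E-step.
    """
    k = len(likelihoods[0])
    proportions = [1.0 / k] * k  # start from a uniform mixture
    responsibilities = []
    for _ in range(iterations):
        # E-step: posterior probability of each haplogroup for each fragment
        responsibilities = []
        for row in likelihoods:
            weighted = [p * l for p, l in zip(proportions, row)]
            total = sum(weighted)
            responsibilities.append([w / total for w in weighted])
        # M-step: each proportion becomes the mean responsibility
        n = len(responsibilities)
        proportions = [sum(r[j] for r in responsibilities) / n for j in range(k)]
    return proportions, responsibilities


def partition_fragments(responsibilities):
    """Assign each fragment to the haplogroup with the highest responsibility."""
    return [max(range(len(r)), key=r.__getitem__) for r in responsibilities]
```

Fragments grouped this way could then be assembled separately per haplogroup, which is the partitioning step the abstract mentions; how the likelihoods themselves are computed from sequence data is outside this sketch.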