19 research outputs found

    Mitochondrial genome diversity across the subphylum Saccharomycotina

    Introduction: Eukaryotic life depends on the functional elements encoded by both the nuclear genome and organellar genomes, such as those contained within the mitochondria. The content, size, and structure of the mitochondrial genome vary across organisms, with potentially large implications for phenotypic variance and resulting evolutionary trajectories. Among yeasts in the subphylum Saccharomycotina, extensive differences have been observed in various species relative to the model yeast Saccharomyces cerevisiae, but mitochondrial genome sampling across many groups has been scarce, even as hundreds of nuclear genomes have become available.
    Methods: By extracting mitochondrial assemblies from existing short-read genome sequence datasets, we have greatly expanded both the number of available genomes and the coverage across sparsely sampled clades.
    Results: Comparison of 353 yeast mitochondrial genomes revealed that, while size and GC content were fairly consistent across species, those in the genera Metschnikowia and Saccharomyces trended larger, while several species in the order Saccharomycetales, which includes S. cerevisiae, exhibited lower GC content. Extreme examples for both size and GC content were scattered throughout the subphylum. All mitochondrial genomes shared a core set of protein-coding genes for Complexes III, IV, and V, but they varied in the presence or absence of mitochondrially-encoded canonical Complex I genes. We traced the loss of Complex I genes to a major event in the ancestor of the orders Saccharomycetales and Saccharomycodales, but we also observed several independent losses in the orders Phaffomycetales, Pichiales, and Dipodascales. In contrast to prior hypotheses based on smaller-scale datasets, comparison of evolutionary rates in protein-coding genes showed no bias towards elevated rates among aerobically fermenting (Crabtree/Warburg-positive) yeasts. Mitochondrial introns were widely distributed, but they were highly enriched in some groups. The majority of mitochondrial introns were poorly conserved within groups, but several were shared within groups, between groups, and even across taxonomic orders, which is consistent with horizontal gene transfer, likely involving homing endonucleases acting as selfish elements.
    Discussion: As the number of available fungal nuclear genomes continues to expand, the methods described here to retrieve mitochondrial genome sequences from these datasets will prove invaluable to ensuring that studies of fungal mitochondrial genomes keep pace with their nuclear counterparts.

    Amplicon sequencing of 42 nuclear loci supports directional gene flow between South Pacific populations of a hydrothermal vent limpet

    In the past few decades, population genetics and phylogeographic studies have improved our knowledge of connectivity and population demography in marine environments. Studies of deep‐sea hydrothermal vent populations have identified barriers to gene flow, hybrid zones, and demographic events, such as historical population expansions and contractions. These deep‐sea studies, however, used few loci, which limited the amount of information they provided for coalescent analysis and thus our ability to confidently test complex population dynamics scenarios. In this study, we investigated population structure, demographic history, and gene flow directionality among four Western Pacific hydrothermal vent populations of the vent limpet Lepetodrilus aff. schrolli. These vent sites are located in the Manus and Lau back‐arc basins, currently of great interest for deep‐sea mineral extraction. A total of 42 loci were sequenced from each individual using high‐throughput amplicon sequencing. Amplicon sequences were analyzed using both genetic variant clustering methods and evolutionary coalescent approaches. Like most previously investigated vent species in the South Pacific, L. aff. schrolli showed no genetic structure within basins but significant differentiation between basins. We inferred significant directional gene flow from Manus Basin to Lau Basin, with low to no gene flow in the opposite direction. This study is one of the very few marine population studies using >10 loci for coalescent analysis and serves as a guide for future marine population studies.

    Extensive loss of cell-cycle and DNA repair genes in an ancient lineage of bipolar budding yeasts

    Cell-cycle checkpoints and DNA repair processes protect organisms from potentially lethal mutational damage. We noticed that, compared to other budding yeasts in the subphylum Saccharomycotina, a lineage in the genus Hanseniaspora exhibited very high evolutionary rates, low Guanine–Cytosine (GC) content, small genome sizes, and lower gene numbers. To better understand Hanseniaspora evolution, we analyzed 25 genomes, including 11 newly sequenced, representing 18/21 known species in the genus. Our phylogenomic analyses identify two Hanseniaspora lineages, a faster-evolving lineage (FEL), which began diversifying approximately 87 million years ago (mya), and a slower-evolving lineage (SEL), which began diversifying approximately 54 mya. Remarkably, both lineages lost genes associated with the cell cycle and genome integrity, but these losses were greater in the FEL. For example, all species lost the cell-cycle regulator WHIskey 5 (WHI5), and the FEL lost components of the spindle checkpoint pathway (e.g., Mitotic Arrest-Deficient 1 [MAD1], Mitotic Arrest-Deficient 2 [MAD2]) and DNA-damage–checkpoint pathway (e.g., Mitosis Entry Checkpoint 3 [MEC3], RADiation sensitive 9 [RAD9]). Similarly, both lineages lost genes involved in DNA repair pathways, including the DNA glycosylase gene 3-MethylAdenine DNA Glycosylase 1 (MAG1), which is part of the base-excision repair pathway, and the DNA photolyase gene PHotoreactivation Repair deficient 1 (PHR1), which is involved in pyrimidine dimer repair. Strikingly, the FEL lost 33 additional genes, including polymerases (i.e., POLymerase 4 [POL4] and POL32) and telomere-associated genes (e.g., Repressor/Activator site binding protein-Interacting Factor 1 [RIF1], Replication Factor A 3 [RFA3], Cell Division Cycle 13 [CDC13], Pbp1p Binding Protein [PBP2]). Echoing these losses, molecular evolutionary analyses reveal that, compared to the SEL, the FEL stem lineage underwent a burst of accelerated evolution, which resulted in greater mutational loads, homopolymer instabilities, and higher fractions of mutations associated with the common endogenously damaged base, 8-oxoguanine. We conclude that Hanseniaspora is an ancient lineage that has diversified and thrived, despite lacking many otherwise highly conserved cell-cycle and genome integrity genes and pathways, and may represent a novel, to our knowledge, system for studying cellular life without them.
    Affiliations: Steenwyk, Jacob L. (Vanderbilt University; United States); Opulente, Dana A. (University of Wisconsin; United States); Kominek, Jacek (University of Wisconsin; United States); Shen, Xing-Xing (Vanderbilt University; United States); Zhou, Xiaofan (South China Agricultural University; China); Labella, Abigail L. (Vanderbilt University; United States); Bradley, Noah P. (Vanderbilt University; United States); Eichman, Brandt F. (Vanderbilt University; United States); Cadez, Neza (University of Ljubljana; Slovenia); Libkind Frati, Diego (Universidad Nacional del Comahue, Centro Regional Universitario Bariloche; Argentina); DeVirgilio, Jeremy (United States Department of Agriculture, Agricultural Research Service; Argentina); Hulfachor, Amanda Beth (University of Wisconsin; United States); Kurtzman, Cletus P. (United States Department of Agriculture, Agricultural Research Service; Argentina); Hittinger, Chris Todd (University of Wisconsin; United States); Rokas, Antonis (Vanderbilt University; United States)

    Mosaic patterns of selection in genomic regions associated with diverse human traits.

    Natural selection shapes the genetic architecture of many human traits. However, the prevalence of different modes of selection on genomic regions associated with variation in traits remains poorly understood. To address this, we developed an efficient computational framework to calculate positive and negative enrichment of different evolutionary measures among regions associated with complex traits. We applied the framework to summary statistics from >900 genome-wide association studies (GWASs) and 11 evolutionary measures of sequence constraint, population differentiation, and allele age while accounting for linkage disequilibrium, allele frequency, and other potential confounders. We demonstrate that this framework yields consistent results across GWASs with variable sample sizes, numbers of trait-associated SNPs, and analytical approaches. The resulting evolutionary atlas maps diverse signatures of selection on genomic regions associated with complex human traits on an unprecedented scale. We detected positive enrichment for sequence conservation among trait-associated regions for the majority of traits (>77% of 290 high power GWASs), which included reproductive traits. Many traits also exhibited substantial positive enrichment for population differentiation, especially among hair, skin, and pigmentation traits. In contrast, we detected widespread negative enrichment for signatures of balancing selection (51% of GWASs) and absence of enrichment for evolutionary signals in regions associated with late-onset Alzheimer's disease. These results support a pervasive role for negative selection on regions of the human genome that contribute to variation in complex traits, but also demonstrate that diverse modes of evolution are likely to have shaped trait-associated loci. 
This atlas of evolutionary signatures across the diversity of available GWASs will enable exploration of the relationship between the genetic architecture and evolutionary processes in the human genome.

    Variation and selection on codon usage bias across an entire subphylum.

    Variation in synonymous codon usage is abundant across multiple levels of organization: between codons of an amino acid, between genes in a genome, and between genomes of different species. It is now well understood that variation in synonymous codon usage is influenced by mutational bias coupled with both natural selection for translational efficiency and genetic drift, but how these processes shape patterns of codon usage bias across entire lineages remains unexplored. To address this question, we used a rich genomic data set of 327 species that covers nearly one third of the known biodiversity of the budding yeast subphylum Saccharomycotina. We found that, while genome-wide relative synonymous codon usage (RSCU) for all codons was highly correlated with the GC content of the third codon position (GC3), the usage of codons for the amino acids proline, arginine, and glycine was inconsistent with the neutral expectation, in which mutational bias coupled with genetic drift drives codon usage. Examination of the relationship between genes' effective numbers of codons and their GC3 contents in individual genomes revealed that nearly a quarter of genes (381,174/1,683,203; 23%), as well as most genomes (308/327; 94%), significantly deviate from the neutral expectation. Finally, by evaluating the imprint of translational selection on codon usage, measured as the degree to which genes' adaptiveness to the tRNA pool was correlated with selective pressure, we show that translational selection is widespread in budding yeast genomes (264/327; 81%). These results suggest that the contribution of translational selection and drift to patterns of synonymous codon usage across budding yeasts varies across codons, genes, and genomes; whereas drift is the primary driver of global codon usage across the subphylum, the codon bias of large numbers of genes in the majority of genomes is influenced by translational selection.
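The two quantities named in this abstract, RSCU and GC3, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `rscu` and `gc3` are hypothetical, and the sketch assumes the standard genetic code (NCBI translation table 1), which some budding yeast lineages do not use. RSCU is computed as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI table 1); an assumption, since some
# Saccharomycotina lineages use alternative codes.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(cds):
    """Split a coding sequence into complete, upper-case codons."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def rscu(cds):
    """Return RSCU per codon for amino acids observed in the sequence.

    RSCU = observed count * (number of synonymous codons) / total count
    for that amino acid. Stop codons and unrecognized codons are ignored.
    """
    counts = Counter(c for c in split_codons(cds) if c in CODON_TABLE)
    result = {}
    for aa in set(AMINO) - {"*"}:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent: RSCU undefined
        for c in family:
            result[c] = counts[c] * len(family) / total
    return result


def gc3(cds):
    """Return the fraction of codons whose third position is G or C."""
    codons = split_codons(cds)
    if not codons:
        return 0.0
    return sum(c[2] in "GC" for c in codons) / len(codons)


if __name__ == "__main__":
    example = "CCTCCTCCCCCG"  # four proline codons
    print(rscu(example))
    print(gc3(example))
```

An RSCU of 1 means a codon is used as often as expected under equal usage; values above or below 1 indicate over- or under-representation.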

    Mosaic evolutionary architecture across 47 well-powered GWASs of human complex traits.

    From our evolutionary atlas of 972 GWASs, we plot a subset of 47 GWASs (BOLT-LMM set) performed using the same approach and from the same cohort (Methods). (a) For each evolutionary measure (columns) and a given trait (row), we calculated the trait-averaged value (x-axis, stars) and compared it with the matched genomic background distribution (gray dots: mean values, gray bars: 5th, 95th percentiles). Traits are manually grouped based on type and similarity. The number of trait-associated regions is provided in parentheses. Red stars (FDRMethods). Results are shown for six evolutionary measures; see S3 Fig for all 11 evolutionary measures. (b) We calculated enrichment as described in Fig 1D and highlight four traits with distinct evolutionary profiles. Spokes represent different evolutionary measures (colored by type of associated force) and concentric rings represent levels of evolutionary enrichment. Red dashed circles represent the expected values (i.e., no enrichment).

    File_S1.xlsx

    This Excel file contains the PMID or web link and the source for each set of GWAS summary statistics analyzed in this study. (XLSX)

    Computational framework for detecting positive and negative enrichment for evolutionary signatures on genome-wide association studies (GWASs).

    (a) Given the GWAS of a complex trait, we define trait-associated regions by first identifying variants of genome-wide significance and then clumping based on linkage disequilibrium (LD; e.g., r² > 0.9). For each region, we identify the maximum value of an evolutionary measure of interest. (b) For each trait-associated region, we identify 5,000 randomly selected genomic regions (“matched regions”) that have similar minor allele frequency, linkage disequilibrium, and gene proximity patterns (Methods). (c) Across the trait-associated regions and their matched random genomic regions, we calculate a summary statistic. To illustrate our approach, we take the mean of the evolutionary measure to generate an (d) empirical background distribution and (e) calculate enrichment by comparing the mean observed evolutionary measure to the mean of the matched background distribution. We divide by the standard deviation of the evolutionary measure across the genome to standardize the enrichment. However, any summary statistic of interest could be used.
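Steps (c) through (e) of this caption can be sketched in a few lines of Python. This is an illustrative reading, not the authors' code: the function names and data layout are hypothetical, and it assumes that "comparing" the observed mean to the background mean is a subtraction. Here `matched_values[i][j]` holds the evolutionary measure for the j-th matched region of the i-th trait-associated region.

```python
from statistics import fmean


def background_means(matched_values):
    """Build the empirical background distribution.

    The j-th background sample is the mean, across trait-associated
    regions, of the j-th matched region's value (one draw per region).
    """
    return [fmean(column) for column in zip(*matched_values)]


def enrichment(trait_values, matched_values, genome_sd):
    """Standardized enrichment of an evolutionary measure.

    Assumed form: (mean observed - mean of matched background means),
    divided by the genome-wide standard deviation of the measure.
    """
    if genome_sd <= 0:
        raise ValueError("genome_sd must be positive")
    observed = fmean(trait_values)
    expected = fmean(background_means(matched_values))
    return (observed - expected) / genome_sd


if __name__ == "__main__":
    # Toy data: 2 trait-associated regions, 3 matched regions each
    # (the caption describes 5,000 matched regions per region).
    trait = [3.0, 5.0]
    matched = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
    print(enrichment(trait, matched, genome_sd=2.0))
```

A positive value indicates that trait-associated regions score higher on the measure than their matched background, a negative value that they score lower. As the caption notes, the mean could be swapped for any other summary statistic.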

    The enrichment for evolutionary signatures is consistent across multiple GWASs of the same trait.

    (a) For four separate GWASs of height (y-axis), we compared the mean trait-associated values (stars) for multiple evolutionary measures (x-axis) with their corresponding matched genomic background mean values (gray dot: mean value, gray bar: 5th, 95th percentile). We calculated an empirical p-value by comparing to the matched background (Methods) and adjusted for multiple testing (FDR-adjusted p-values Fig 1D. See Table 2 and Methods for details on the four GWASs analyzed.
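The empirical p-value and multiple-testing adjustment mentioned in this caption could look roughly like the sketch below. The caption does not give the details, so two assumptions are made and should be checked against the paper's Methods: the p-value is one-sided (upper tail) with a +1 correction, and the FDR adjustment is the Benjamini-Hochberg procedure. Both function names are hypothetical.

```python
def empirical_p_value(observed, background):
    """One-sided (upper-tail) empirical p-value with a +1 correction.

    Counts background values at least as large as the observed value,
    so the result is never exactly zero. The one-sided form is an
    assumption, not taken from the caption.
    """
    at_least = sum(1 for value in background if value >= observed)
    return (1 + at_least) / (1 + len(background))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    p = empirical_p_value(4.0, [1.0, 2.0, 3.0, 5.0])
    print(p)
    print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

To test for negative enrichment under the same assumptions, the comparison in `empirical_p_value` would be reversed to count background values at most as large as the observed one.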

    Enrichment for evolutionary measures is not sensitive to different matching parameters for gene distance and density.

    For each evolutionary measure (one plot per measure), we repeated the analysis in Fig 2B and calculated the enrichment (y-axis) across trait-associated regions partitioned by association effect size (x-axis, ordered from negative to positive effect size) for the original Fig 2B analysis (red X) and four other conditions. We repeated the analysis after changing either the gene distance matching threshold to +/-25% (dark blue) or +/-10% (light blue), or the gene density matching threshold to +/-25% (dark green) or +/-10% (light green), while keeping all other parameters the same. The patterns are similar for nearly all settings. (TIF)