102 research outputs found

    Patterns of genic intolerance of rare copy number variation in 59,898 human exomes.

    Copy number variation (CNV) affecting protein-coding genes contributes substantially to human diversity and disease. Here we characterized the rates and properties of rare genic CNVs (<0.5% frequency) in exome sequencing data from nearly 60,000 individuals in the Exome Aggregation Consortium (ExAC) database. On average, individuals possessed 0.81 deleted and 1.75 duplicated genes, and most (70%) carried at least one rare genic CNV. For every gene, we empirically estimated an index of relative intolerance to CNVs that demonstrated moderate correlation with measures of genic constraint based on single-nucleotide variation (SNV) and was independently correlated with measures of evolutionary conservation. For individuals with schizophrenia, genes affected by CNVs were more intolerant than in controls. The ExAC CNV data constitute a critical component of an integrated database spanning the spectrum of human genetic variation, aiding in the interpretation of personal genomes as well as population-based disease studies. These data are freely available for download and visualization online.

    Population genomics of domestic and wild yeasts

    The natural genetics of an organism is determined by the distribution of sequences of its genome. Here we present one- to four-fold coverage (deeper for some isolates) of the genome sequences of over seventy isolates of the domesticated baker's yeast, _Saccharomyces cerevisiae_, and its closest relative, the wild _S. paradoxus_, which has never been associated with human activity. These were collected from numerous geographic locations and sources (including wild, clinical, baking, wine, laboratory and food spoilage). These sequences provide an unprecedented view of the population structure, natural (and artificial) selection and genome evolution in these species. Variation in gene content, SNPs, indels, copy numbers and transposable elements provides insights into the evolution of different lineages. Phenotypic variation broadly correlates with global genome-wide phylogenetic relationships; however, there is no correlation with source. _S. paradoxus_ populations are well delineated along geographic boundaries, while the variation among worldwide _S. cerevisiae_ isolates shows less differentiation and is comparable to a single _S. paradoxus_ population. Rather than one or two domestication events leading to the extant baker's yeasts, the population structure of _S. cerevisiae_ shows a few well-defined, geographically isolated lineages and many different mosaics of these lineages, supporting the notion that human influence provided the opportunity for outbreeding and production of new combinations of pre-existing variation.

    Genetic Architecture of Highly Complex Chemical Resistance Traits across Four Yeast Strains

    Many questions about the genetic basis of complex traits remain unanswered. This is in part due to the low statistical power of traditional genetic mapping studies. We used a statistically powerful approach, extreme QTL mapping (X-QTL), to identify the genetic basis of resistance to 13 chemicals in all 6 pairwise crosses of four ecologically and genetically diverse yeast strains, and we detected a total of more than 800 loci. We found that the number of loci detected in each experiment was primarily a function of the trait (explaining 46% of the variance) rather than the cross (11%), suggesting that the level of genetic complexity is a consistent property of a trait across different genetic backgrounds. Further, we observed that most loci had trait-specific effects, although a small number of loci with effects in many conditions were identified. We used the patterns of resistance and susceptibility alleles in the four parent strains to make inferences about the allele frequency spectrum of functional variants. We also observed evidence of more complex allelic series at a number of loci, as well as strain-specific signatures of selection. These results improve our understanding of complex traits in yeast and have implications for study design in other organisms.

    Using Stochastic Causal Trees to Augment Bayesian Networks for Modeling eQTL Datasets

    Background: The combination of genotypic and genome-wide expression data arising from segregating populations offers an unprecedented opportunity to model and dissect complex phenotypes. The immense potential offered by these data derives from the fact that genotypic variation is the sole source of perturbation and can therefore be used to reconcile changes in gene expression programs with the parental genotypes. To date, several methodologies have been developed for modeling eQTL data. These methods generally leverage genotypic data to resolve causal relationships among gene pairs implicated as associates in the expression data. In particular, leading studies have augmented Bayesian networks with genotypic data, providing a powerful framework for learning and modeling causal relationships. While these initial efforts have provided promising results, one major drawback associated with these methods is that they are generally limited to resolving causal orderings for transcripts most proximal to the genomic loci. In this manuscript, we present a probabilistic method capable of learning the causal relationships between transcripts at all levels in the network. We use the information provided by our method as a prior for Bayesian network structure learning, resulting in enhanced performance for gene network reconstruction.
    Results: Using established protocols to synthesize eQTL networks and corresponding data, we show that our method achieves improved performance over existing leading methods. For the goal of gene network reconstruction, our method achieves improvements in recall ranging from 20% to 90% across a broad range of precision levels and for datasets of varying sample sizes. Additionally, we show that the learned networks can be utilized for expression quantitative trait loci mapping, resulting in upwards of 10-fold increases in recall over traditional univariate mapping.
    Conclusions: Using the information from our method as a prior for Bayesian network structure learning yields large improvements in accuracy for the tasks of gene network reconstruction and expression quantitative trait loci mapping. In particular, our method is effective for establishing causal relationships between transcripts located both proximally and distally from genomic loci.

    Finding the sources of missing heritability in a yeast cross

    For many traits, including susceptibility to common diseases in humans, causal loci uncovered by genetic mapping studies explain only a minority of the heritable contribution to trait variation. Multiple explanations for this "missing heritability" have been proposed. Here we use a large cross between two yeast strains to accurately estimate different sources of heritable variation for 46 quantitative traits and to detect underlying loci with high statistical power. We find that the detected loci explain nearly the entire additive contribution to heritable variation for the traits studied. We also show that the contribution to heritability of gene-gene interactions varies among traits, from near zero to 50%. Detected two-locus interactions explain only a minority of this contribution. These results substantially advance our understanding of the missing heritability problem and have important implications for future studies of complex and quantitative traits.

    The Evolution of Gene Expression QTL in Saccharomyces cerevisiae

    Understanding the evolutionary forces that influence patterns of gene expression variation will provide insights into the mechanisms of evolutionary change and the molecular basis of phenotypic diversity. To date, studies of gene expression evolution have primarily been made by analyzing how gene expression levels vary within and between species. However, the fundamental unit of heritable variation in transcript abundance is the underlying regulatory allele, and as a result it is necessary to understand gene expression evolution at the level of DNA sequence variation. Here we describe the evolutionary forces shaping patterns of genetic variation for 1206 cis-regulatory QTL identified in a cross between two divergent strains of Saccharomyces cerevisiae. We demonstrate that purifying selection against mildly deleterious alleles is the dominant force governing cis-regulatory evolution in S. cerevisiae and estimate the strength of selection. We also find that essential genes and genes with larger codon bias are subject to slightly stronger cis-regulatory constraint and that positive selection has played a role in the evolution of major trans-acting QTL.

    Incipient Balancing Selection through Adaptive Loss of Aquaporins in Natural Saccharomyces cerevisiae Populations

    A major goal in evolutionary biology is to understand how adaptive evolution has influenced natural variation, but identifying loci subject to positive selection has been a challenge. Here we present the adaptive loss of a pair of paralogous genes in specific Saccharomyces cerevisiae subpopulations. We mapped natural variation in freeze-thaw tolerance to two water transporters, AQY1 and AQY2, previously implicated in freeze-thaw survival. However, whereas freeze-thaw-tolerant strains harbor functional aquaporin genes, the set of sensitive strains lost aquaporin function at least 6 independent times. Several genomic signatures at AQY1 and/or AQY2 reveal low variation surrounding these loci within strains of the same haplotype, but high variation between strain groups. This is consistent with recent adaptive loss of aquaporins in subgroups of strains, leading to incipient balancing selection. We show that, although aquaporins are critical for surviving freeze-thaw stress, loss of both genes provides a major fitness advantage on high-sugar substrates common to many strains' natural niche. Strikingly, strains with non-functional alleles have also lost the ancestral requirement for aquaporins during spore formation. Thus, the antagonistic effect of aquaporin function (an advantage in freeze-thaw tolerance but a fitness defect for growth in high-sugar environments) contributes to the maintenance of both functional and nonfunctional alleles in S. cerevisiae. This work also shows that gene loss through multiple missense and nonsense mutations, hallmarks of pseudogenization presumed to emerge after loss of constraint, can arise through positive selection.

    Analysis of protein-coding genetic variation in 60,706 humans

    Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation, identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

    Evidence For Genetic Heterogeneity Between Clinical Subtypes of Bipolar Disorder

    We performed a genome-wide association study of 6447 bipolar disorder (BD) cases and 12 639 controls from the International Cohort Collection for Bipolar Disorder (ICCBD). Meta-analysis was performed with prior results from the Psychiatric Genomics Consortium Bipolar Group for a combined sample of 13 902 cases and 19 279 controls. We identified eight genome-wide significant, associated regions, including a novel associated region on chromosome 10 (rs10884920; P = 3.28 × 10⁻⁸) that includes the brain-enriched cytoskeleton protein adducin 3 (ADD3), a non-coding RNA, and a neuropeptide-specific aminopeptidase P (XPNPEP1). Our large sample size allowed us to test the heritability and genetic correlation of BD subtypes and investigate their genetic overlap with schizophrenia (SCZ) and major depressive disorder. We found a significant difference in heritability of the two most common forms of BD (BD I h² = 0.35; BD II h² = 0.25; P = 0.02) with a genetic correlation between BD I and BD II of 0.78, compared with a genetic correlation of 0.97 when BD cohorts containing both types were compared. In addition, we demonstrated a significantly greater load of polygenic risk alleles for SCZ and BD in patients with BD I compared with patients with BD II, and a greater load of SCZ risk alleles in the bipolar type of schizoaffective disorder (SAB) compared with both other BD subtypes. These results point to a partial difference in genetic architecture of BD subtypes, and are suggestive of a molecular correlate for the clinical division of BD into subtypes.