27 research outputs found

    Genotype-Environment Interactions Reveal Causal Pathways That Mediate Genetic Effects on Phenotype

    Unraveling the molecular processes that lead from genotype to phenotype is crucial for the understanding and effective treatment of genetic diseases. Knowledge of the causative genetic defect most often does not enable treatment; therefore, causal intermediates between genotype and phenotype constitute valuable candidates for molecular intervention points that can be therapeutically targeted. Mapping genetic determinants of gene expression levels (also known as expression quantitative trait loci or eQTL studies) is frequently used for this purpose, yet distinguishing causation from correlation remains a significant challenge. Here, we address this challenge using extensive, multi-environment gene expression and fitness profiling of hundreds of genetically diverse yeast strains, in order to identify truly causal intermediate genes that condition fitness in a given environment. Using functional genomics assays, we show that the predictive power of eQTL studies for inferring causal intermediate genes is poor unless performed across multiple environments. Surprisingly, although the effects of genotype on fitness depended strongly on environment, causal intermediates could be most reliably predicted from genetic effects on expression present in all environments. Our results indicate a mechanism explaining this apparent paradox, whereby immediate molecular consequences of genetic variation are shared across environments, and environment-dependent phenotypic effects result from downstream integration of environmental signals. We developed a statistical model to predict causal intermediates that leverages this insight, yielding over 400 transcripts, for the majority of which we experimentally validated their role in conditioning fitness. 
Our findings have implications for the design and analysis of clinical omics studies aimed at discovering personalized targets for molecular intervention, suggesting that inferring causation in a single cellular context can benefit from molecular profiling in multiple contexts.

    A privacy-preserving solution for compressed storage and selective retrieval of genomic data

    In clinical genomics, the continuous evolution of bioinformatic algorithms and sequencing platforms makes it beneficial to store patients' complete aligned genomic data in addition to variant calls relative to a reference sequence. Due to the large size of human genome sequence data files (varying from 30 GB to 200 GB depending on coverage), two major challenges facing genomics laboratories are the costs of storage and the efficiency of the initial data processing. In addition, privacy of genomic data is becoming an increasingly serious concern, yet no standard data storage solutions exist that enable compression, encryption, and selective retrieval. Here we present a privacy-preserving solution named SECRAM (Selective retrieval on Encrypted and Compressed Reference-oriented Alignment Map) for the secure storage of compressed aligned genomic data. Our solution enables selective retrieval of encrypted data and improves the efficiency of downstream analysis (e.g., variant calling). Compared with BAM, the de facto standard for storing aligned genomic data, SECRAM uses 18% less storage. Compared with CRAM, one of the most compressed nonencrypted formats (using 34% less storage than BAM), SECRAM maintains efficient compression and downstream data processing, while allowing for unprecedented levels of security in genomic data storage. Compared with previous work, the distinguishing features of SECRAM are that (1) it is position-based instead of read-based, and (2) it allows random querying of a subregion from a BAM-like file in an encrypted form. Our method thus offers a space-saving, privacy-preserving, and effective solution for the storage of clinical genomic data.
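
The two storage percentages in this abstract are both stated relative to BAM, so the SECRAM-to-CRAM comparison has to be derived. The short sketch below does that arithmetic; the helper name `relative_overhead` is only illustrative, and the 18% and 34% figures are the only inputs taken from the abstract.

```python
# Storage footprints expressed as a fraction of the BAM size.
BAM = 1.00
SECRAM = BAM * (1 - 0.18)  # abstract: 18% less storage than BAM
CRAM = BAM * (1 - 0.34)    # abstract: 34% less storage than BAM


def relative_overhead(a, b):
    """Fractional extra storage that format `a` needs compared with format `b`."""
    return a / b - 1


overhead = relative_overhead(SECRAM, CRAM)
print(f"SECRAM needs about {overhead:.0%} more storage than CRAM")
```

Under these figures, a file that occupies 100 GB as BAM would take roughly 82 GB as SECRAM and 66 GB as CRAM, which is the storage cost of adding encryption and selective retrieval on top of compression.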

    Experimental Relocation of the Mitochondrial ATP9 Gene to the Nucleus Reveals Forces Underlying Mitochondrial Genome Evolution

    The mitochondrial genome retained by every eukaryotic organism holds only a few genes, which carry out essential functions and are implicated in severe diseases. Experimentally relocating these few genes to the nucleus therefore has both therapeutic and evolutionary implications. Numerous unproductive attempts have been made to do so, with a total of only 5 successes across all organisms. We have taken a novel approach to relocating mitochondrial genes that utilizes naturally nuclear versions from other organisms. We demonstrate this approach on subunit 9/c of ATP synthase, successfully relocating this gene for the first time in any organism by expressing the ATP9 genes from Podospora anserina in Saccharomyces cerevisiae. This study substantiates the role of protein structure in mitochondrial gene transfer: expression of chimeric constructs reveals that the P. anserina proteins can be correctly imported into mitochondria due to reduced hydrophobicity of the first transmembrane segment. Nuclear expression of ATP9, while permitting almost fully functional oxidative phosphorylation, perturbs many cellular properties, including cellular morphology, and activates the heat shock response. Altogether, our study establishes a novel strategy for allotopic expression of mitochondrial genes, demonstrates the complex adaptations required to relocate ATP9, and indicates a reason that this gene was only transferred to the nucleus during the evolution of multicellular organisms.

    Mitochondrial protein sorting as a therapeutic target for ATP synthase disorders

    Mitochondrial diseases are systemic, prevalent and often fatal; yet treatments remain scarce. Identifying molecular intervention points that can be therapeutically targeted remains a major challenge, which we confronted via a screening assay we developed. Using yeast models of mitochondrial ATP synthase disorders, we screened a drug repurposing library, and applied genomic and biochemical techniques to identify pathways of interest. Here we demonstrate that modulating the sorting of nuclear-encoded proteins into mitochondria, mediated by the TIM23 complex, proves therapeutic in both yeast and patient-derived cells exhibiting ATP synthase deficiency. Targeting TIM23-dependent protein sorting improves an array of phenotypes associated with ATP synthase disorders, including biogenesis and activity of the oxidative phosphorylation machinery. Our study establishes mitochondrial protein sorting as an intervention point for ATP synthase disorders, and because of the central role of this pathway in mitochondrial biogenesis, it holds broad value for the treatment of mitochondrial diseases.

    An evaluation of high-throughput approaches to QTL mapping in Saccharomyces cerevisiae

    Dissecting the molecular basis of quantitative traits is a significant challenge and is essential for understanding complex diseases. Even in model organisms, precisely determining causative genes and their interactions has remained elusive, due in part to difficulty in narrowing intervals to single genes and in detecting epistasis or linked quantitative trait loci. These difficulties are exacerbated by limitations in experimental design, such as low numbers of analyzed individuals or of polymorphisms between parental genomes. We address these challenges by applying three independent high-throughput approaches for QTL mapping to map the genetic variants underlying 11 phenotypes in two genetically distant Saccharomyces cerevisiae strains, namely (1) individual analysis of >700 meiotic segregants, (2) bulk segregant analysis, and (3) reciprocal hemizygosity scanning, a new genome-wide method that we developed. We reveal differences in the performance of each approach and, by combining them, identify eight polymorphic genes that affect eight different phenotypes: colony shape, flocculation, growth on two nonfermentable carbon sources, and resistance to two drugs, salt, and high temperature. Our results demonstrate the power of individual segregant analysis to dissect QTL and address the underestimated contribution of interactions between variants. We also reveal confounding factors like mutations and aneuploidy in pooled approaches, providing valuable lessons for future designs of complex trait mapping studies

    Genetic architecture of growth rate and gene expression in multiple environments.

    <p>(<b>a</b>) Genetic associations with growth (growth QTLs) in 26 environmental conditions. The significance of association (<i>P</i>-value, single-marker analysis, Methods) is shown for each of 13,314 markers along the genome (x-axis) with growth rates in 26 environments (y-axis). The direction of the QTL effect is color-coded, where red indicates that the clinical isolate (Y) allele is associated with increased growth rate and blue the lab strain (S) allele; darker colors indicate greater significance. Two examples of markers significantly associated with growth in only a small number of environments (<i>MAL13</i> and <i>SUC8</i>, black rectangles), and two showing significant effects in opposing directions depending on environment (<i>HAP1</i> and <i>MKT1</i>, dotted black rectangles) are highlighted. (<b>b</b>) Genetic associations with gene expression (eQTLs) in 5 selected environments. Each panel shows the number of genes associated with the underlying regions in a sliding window analysis for each environment (FDR<0.05, 50 kb window). The association strength of growth from (<b>a</b>) is displayed in the color bars below each panel. Six significant genetic loci were identified that jointly regulate growth in these environments (<i>AMN1</i>, <i>CHRV</i>, <i>MAL13</i>, <i>CHRX</i>, <i>HAP1</i>, <i>MKT1</i>, multi-environment growth genetic model, Methods). These are labeled in bold for every environment in which they were associated with growth in (<b>a</b>) (growth QTL).</p>
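
Panel (b) of this caption counts, per 50 kb window, the genes whose expression is associated with the underlying region at FDR<0.05. The sketch below shows one plausible way to compute such counts. It is not the authors' pipeline: the input layout, the function name `sliding_window_counts`, and the 10 kb step size are assumptions (the caption gives the window width and FDR cutoff only), and the toy data are invented.

```python
from bisect import bisect_left

WINDOW_BP = 50_000   # window width stated in the caption
FDR_CUTOFF = 0.05    # significance threshold stated in the caption
STEP_BP = 10_000     # assumption: the caption does not give a step size


def sliding_window_counts(associations, chrom_length,
                          window=WINDOW_BP, step=STEP_BP,
                          fdr_cutoff=FDR_CUTOFF):
    """Count distinct genes with a significant eQTL marker in each window.

    associations: iterable of (gene, marker_position_bp, fdr) tuples for one
    chromosome in one environment (hypothetical input layout).
    Returns a list of (window_start_bp, n_genes) tuples.
    """
    # Keep significant associations only, sorted by marker position.
    significant = sorted((pos, gene) for gene, pos, fdr in associations
                         if fdr < fdr_cutoff)
    positions = [pos for pos, _ in significant]
    counts = []
    for start in range(0, max(chrom_length - window, 0) + 1, step):
        lo = bisect_left(positions, start)           # first marker >= start
        hi = bisect_left(positions, start + window)  # first marker >= window end
        genes = {gene for _, gene in significant[lo:hi]}
        counts.append((start, len(genes)))
    return counts


# Invented toy data, not values from the study.
toy = [
    ("geneA", 12_000, 0.01),
    ("geneB", 48_000, 0.03),
    ("geneC", 55_000, 0.20),  # above the FDR cutoff, so ignored
    ("geneD", 61_000, 0.04),
    ("geneA", 15_000, 0.02),  # second marker for geneA, counted once per window
]
counts = sliding_window_counts(toy, chrom_length=100_000)
for start, n_genes in counts:
    print(start, n_genes)
```

Counting distinct genes rather than marker-gene pairs keeps a gene linked to several nearby markers from inflating a window's total.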

    eQTLs that persist across environments are effective predictors of causal intermediates.

    <p>(<b>a</b>) Validation rate (relative to a random selection of genes, Methods) for the top 100 genes whose expression was significantly associated with a growth QTL that is environment-dependent (blue) or persistent (orange), based on genome-wide deletion assays. Error bars indicate plus or minus one standard deviation (jackknife resampling of the growth QTLs, Methods). Star indicates significance of difference (<i>P</i><0.002, two-sided paired Wilcoxon rank sum test). (<b>b</b>) <i>MRP51</i>: example of a candidate causal intermediate predicted by the Bayesian network to mediate the effect of the <i>MKT1</i> genotype on growth in ethanol. In each of the 5 environments (panels), growth rate (y-axis) is plotted vs. <i>MRP51</i> expression level (x-axis) and <i>MKT1</i> genotype is indicated (clinical isolate allele Y in red, laboratory strain allele S in blue) for all profiled segregants. <i>MRP51</i> constitutes a strong candidate causal intermediate because: 1) <i>MRP51</i> expression is persistently associated with the <i>MKT1</i> genotype, in every environment (vertical bars in each panel mark the midpoint between the expression mean of the two subpopulations); and 2) <i>MRP51</i> expression correlates with growth in the ethanol environment (trendline based on linear regression, see also <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003803#pgen.1003803.s012" target="_blank">Fig. S12</a>). (<b>c</b>) Number of predicted causal intermediate genes validated by deletion (y-axis) vs. number predicted, sorted by prediction confidence (x-axis) for Bayesian network of persistent intermediate genes (purple), persistent eQTL associations (orange), Bayesian network based on single environments (pink), environment-dependent eQTL associations (blue), and random selection (black dashed line). Validations were based on genome-wide deletion phenotypes (Methods).</p>
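
The error bars in panel (a) come from jackknife resampling of the growth QTLs; the authors' exact procedure is described in their Methods and is not reproduced here. As a rough illustration only, the sketch below implements the generic delete-one jackknife estimate of a statistic's standard deviation. The function name `jackknife_sd`, the choice of the mean as the statistic, and the toy values are all assumptions.

```python
import math


def jackknife_sd(values, statistic):
    """Delete-one jackknife estimate of the standard deviation of `statistic`.

    Recomputes the statistic with each observation left out in turn, then
    scales the spread of those leave-one-out estimates by (n - 1) / n.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    leave_one_out = [statistic(values[:i] + values[i + 1:]) for i in range(n)]
    center = sum(leave_one_out) / n
    spread = sum((t - center) ** 2 for t in leave_one_out)
    return math.sqrt((n - 1) / n * spread)


def mean(xs):
    return sum(xs) / len(xs)


# Invented per-QTL validation rates, for illustration only.
toy_rates = [1.0, 2.0, 3.0, 4.0, 5.0]
sd = jackknife_sd(toy_rates, mean)
print(f"{mean(toy_rates):.2f} +/- {sd:.4f}")
```

For the mean, this estimate coincides with the familiar sample standard deviation divided by the square root of n, which gives a quick sanity check on the implementation.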

    Molecular basis of the <i>MKT1</i> genotype's opposing effects on growth.

    <p>(<b>a</b>) Distribution of growth rates according to <i>MKT1</i> genotype (laboratory strain allele (S) in blue, clinical isolate allele (Y) in red) in two environments: ethanol (left panel) and rapamycin (right panel). The association is significant for both environments (<i>P</i><6×10<sup>−10</sup> and <i>P</i><0.05 respectively, two-sided Wilcoxon rank sum test) but the alleles have opposing effects on growth (S detrimental in ethanol, Y detrimental in rapamycin). (<b>b</b>) Association of <i>MKT1</i> genotype with gene expression (<i>P</i>-value, right and top display higher expression values associated with the clinical strain allele, left and bottom display higher expression levels associated with the laboratory strain allele) in ethanol (x-axis) and rapamycin (y-axis) for all genes. Unlike for growth, the overall effects of <i>MKT1</i> genotype on gene expression levels are in the same direction (positive correlation; Wilcoxon rank-sum test P<2.2×10<sup>−16</sup>). High-ranking candidate causal intermediate genes according to the Bayesian network and common to both environments (89 common genes from the top 100 in each environment) are highlighted in purple. (<b>c</b>) Fitness defects induced by gene deletion (selection coefficient from deletion collection assay, Methods) in ethanol (x-axis) versus rapamycin (y-axis), color-coded as in b). Left and upper panels show the distribution of the selection coefficient for the deletion of the candidate genes (purple) and all other genes (grey). Candidate genes (purple) are typically beneficial for growth in ethanol and detrimental in rapamycin. (<b>d</b>) Model of the genotype-environment interaction that explains the <i>MKT1</i> genotype's opposing effects on growth. 
The <i>MKT1</i> clinical isolate allele upregulates expression of several mitochondrial genes (<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003803#pgen.1003803.s017" target="_blank">Table S4</a>) regardless of environment; this regulation leads to improved growth rates in ethanol, but repressed growth in the presence of rapamycin.</p>