    Which mouse multiparental population is right for your study? The Collaborative Cross inbred strains, their F1 hybrids, or the Diversity Outbred population.

    Multiparental populations (MPPs) encompass greater genetic diversity than traditional experimental crosses of two inbred strains, enabling broader surveys of genetic variation underlying complex traits. Two such mouse MPPs are the Collaborative Cross (CC) inbred panel and the Diversity Outbred (DO) population, which are descended from the same eight inbred strains. Additionally, the F1 intercrosses of CC strains (CC-RIX) have been used and enable study designs with replicate outbred mice. Genetic analyses commonly used by researchers to investigate complex traits in these populations include characterizing how heritable a trait is, i.e. its heritability, and mapping its underlying genetic loci, i.e. its quantitative trait loci (QTLs). Here we evaluate the relative merits of these populations for these tasks through simulation, as well as provide recommendations for performing the quantitative genetic analyses. We find that sample populations that include replicate animals, as possible with the CC and CC-RIX, provide more efficient and precise estimates of heritability. We report QTL mapping power curves for the CC, CC-RIX, and DO across a range of QTL effect sizes and polygenic backgrounds for samples of 174 and 500 mice. The utility of replicate animals in the CC and CC-RIX for mapping QTLs rapidly decreased as traits became more polygenic. Only large sample populations of 500 DO mice were well-powered to detect smaller effect loci (7.5-10%) for highly complex traits (80% polygenic background). All results were generated with our R package musppr, which we developed to simulate data from these MPPs and evaluate genetic analyses from user-provided genotypes

    The trouble with triples: Examining the impact of measurement error in mediation analysis.

    Mediation analysis is used in genetic mapping studies to identify candidate gene mediators of quantitative trait loci (QTL). We consider genetic mediation analysis of triplets: sets of three variables consisting of a target trait, the genotype at a QTL for the target trait, and a candidate mediator that is the abundance of a transcript or protein whose coding gene co-locates with the QTL. We show that, in the presence of measurement error, mediation analysis can infer partial mediation even in the absence of a causal relationship between the candidate mediator and the target. We describe a measurement error model and a corresponding latent variable model with estimable parameters that are combinations of the causal effects and measurement errors across all three variables. The relative magnitudes of the latent variable correlations determine whether or not mediation analysis will tend to infer the correct causal relationship in large samples. We examine case studies that illustrate the common failure modes of genetic mediation analysis and demonstrate how to evaluate the effects of measurement error. While genetic mediation analysis is a powerful tool for identifying candidate genes, we recommend caution when interpreting mediation analysis findings
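
The measurement-error effect described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' model: it assumes a continuous stand-in for the QTL genotype, unit effect sizes, unit noise variances, and measurement error on the genotype only. The function names `ols_slopes` and `simulate_triplet` are hypothetical. The mediator and the target each depend on the true genotype and not on each other, yet once the genotype is observed with error, adjusting for the mediator shrinks the apparent genotype effect, which a regression-based mediation analysis would read as partial mediation.

```python
import numpy as np


def ols_slopes(y, *predictors):
    """Least-squares slopes of y on the predictors (intercept fitted, then dropped)."""
    X = np.column_stack([np.ones_like(y), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]


def simulate_triplet(n=20000, genotype_noise_sd=1.0, seed=0):
    """Simulate a (genotype, mediator, target) triplet with NO mediator -> target effect.

    All effect sizes and variances are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    g_true = rng.normal(size=n)               # true (latent) genotype signal
    mediator = g_true + rng.normal(size=n)    # G -> M
    target = g_true + rng.normal(size=n)      # G -> Y, independent of M given G
    # Observed genotype = true genotype + measurement error
    g_obs = g_true + genotype_noise_sd * rng.normal(size=n)

    total = ols_slopes(target, g_obs)[0]                         # Y ~ G_obs
    direct, via_mediator = ols_slopes(target, g_obs, mediator)   # Y ~ G_obs + M
    return {"total": total, "direct": direct, "mediator": via_mediator}


if __name__ == "__main__":
    print("error-free genotype:", simulate_triplet(genotype_noise_sd=0.0))
    print("noisy genotype:     ", simulate_triplet(genotype_noise_sd=1.0))
```

Under these assumed variances, the large-sample slopes with an error-free genotype are 1 for the genotype and 0 for the mediator. With a noise standard deviation of 1 they move to about 1/2 for the total effect and about 1/3 each for the adjusted genotype and mediator slopes.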

    A Bayesian model selection approach to mediation analysis.

    Genetic studies often seek to establish a causal chain of events originating from genetic variation through to molecular and clinical phenotypes. When multiple phenotypes share a common genetic association, one phenotype may act as an intermediate for the genetic effects on the other. Alternatively, the phenotypes may be causally unrelated but share genetic loci. Mediation analysis represents a class of causal inference approaches used to determine which of these scenarios is most plausible. We have developed a general approach to mediation analysis based on Bayesian model selection and have implemented it in an R package, bmediatR. Bayesian model selection provides a flexible framework that can be tailored to different analyses. Our approach can incorporate prior information about the likelihood of models and the strength of causal effects. It can also accommodate multiple genetic variants or multi-state haplotypes. Our approach reports posterior probabilities that can be useful in interpreting uncertainty among competing models. We compared bmediatR with other popular methods, including the Sobel test, Mendelian randomization, and Bayesian network analysis, using simulated data. We found that bmediatR performed as well as or better than these alternatives in most scenarios. We applied bmediatR to proteome data from Diversity Outbred (DO) mice, a multi-parent population, and demonstrated the power of mediation with multi-state haplotypes. We also applied bmediatR to data from human cell lines to identify transcripts that are mediated through or are expressed independently from local chromatin accessibility. We demonstrate that Bayesian model selection provides a powerful and versatile approach to identify causal relationships in genetic studies using model organism or human data
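
For orientation, the Sobel test named above as a comparison method can be sketched in a few lines. This is the textbook first-order Sobel statistic computed from two ordinary least-squares fits. It is not bmediatR and does not reproduce the authors' comparison. The helper names `ols_fit` and `sobel_z` are hypothetical, and a single continuous predictor stands in for the genotype.

```python
import numpy as np


def ols_fit(y, X):
    """OLS coefficients and standard errors; X must already include an intercept column."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof                 # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)        # coefficient covariance matrix
    return coef, np.sqrt(np.diag(cov))


def sobel_z(x, m, y):
    """First-order Sobel z statistic for the indirect effect x -> m -> y."""
    ones = np.ones(len(x))
    coef_a, se_a = ols_fit(m, np.column_stack([ones, x]))      # m ~ x
    coef_b, se_b = ols_fit(y, np.column_stack([ones, x, m]))   # y ~ x + m
    a, sa = coef_a[1], se_a[1]   # effect of x on m
    b, sb = coef_b[2], se_b[2]   # effect of m on y, adjusting for x
    return a * b / np.sqrt(b**2 * sa**2 + a**2 * sb**2)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=2000)
    m = x + rng.normal(size=2000)
    y = m + rng.normal(size=2000)    # fully mediated toy example
    print(sobel_z(x, m, y))
```

A large absolute z value supports an indirect effect through the mediator. Unlike the model-selection approach in the abstract, this statistic returns a single test result rather than posterior probabilities over competing causal models.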

    Genome-wide transcript and protein analysis highlights the role of protein homeostasis in the aging mouse heart.

    Investigation of the molecular mechanisms of aging in the human heart is challenging because of confounding factors, such as diet and medications, as well as limited access to tissues from healthy aging individuals. The laboratory mouse provides an ideal model to study aging in healthy individuals in a controlled environment. However, previous mouse studies have examined only a narrow range of the genetic variation that shapes individual differences during aging. Here, we analyze transcriptome and proteome data from 185 genetically diverse male and female mice at ages 6, 12, and 18 mo to characterize molecular changes that occur in the aging heart. Transcripts and proteins reveal activation of pathways related to exocytosis and cellular transport with age, whereas processes involved in protein folding decrease with age. Additional changes are apparent only in the protein data including reduced fatty acid oxidation and increased autophagy. For proteins that form complexes, we see a decline in correlation between their component subunits with age, suggesting age-related loss of stoichiometry. The most affected complexes are themselves involved in protein homeostasis, which potentially contributes to a cycle of progressive breakdown in protein quality control with age. Our findings highlight the important role of post-transcriptional regulation in aging. In addition, we identify genetic loci that modulate age-related changes in protein homeostasis, suggesting that genetic variation can alter the molecular aging process

    Candidate Risk Factors and Mechanisms for Tolvaptan-Induced Liver Injury Are Identified Using a Collaborative Cross Approach

    Clinical trials of tolvaptan showed it to be a promising candidate for the treatment of Autosomal Dominant Polycystic Kidney Disease (ADPKD) but also revealed potential for idiosyncratic drug-induced liver injury (DILI) in this patient population. To identify risk factors and mechanisms underlying tolvaptan DILI, 8 mice in each of 45 strains of the genetically diverse Collaborative Cross (CC) mouse population were treated with a single oral dose of either tolvaptan or vehicle. Significant elevations in plasma alanine aminotransferase (ALT) were observed in tolvaptan-treated animals in 3 of the 45 strains. Genetic mapping coupled with transcriptomic analysis in the liver was used to identify several candidate susceptibility genes including epoxide hydrolase 2, interferon regulatory factor 3, and mitochondrial fission factor. Gene pathway analysis revealed that oxidative stress and immune response pathways were activated in response to tolvaptan treatment across all strains, but genes involved in regulation of bile acid homeostasis were most associated with tolvaptan-induced elevations in ALT. Secretory leukocyte peptidase inhibitor (Slpi) mRNA was also induced in the susceptible strains and was associated with increased plasma levels of Slpi protein, suggesting a potential serum marker for DILI susceptibility. In summary, tolvaptan induced signs of oxidative stress, mitochondrial dysfunction, and innate immune response in all strains, but variation in bile acid homeostasis was most associated with susceptibility to the liver response. This CC study has indicated potential mechanisms underlying tolvaptan DILI and biomarkers of susceptibility that may be useful in managing the risk of DILI in ADPKD patients

    Genetic dissection of the pluripotent proteome through multi-omics data integration.

    Genetic background drives phenotypic variability in pluripotent stem cells (PSCs). Most studies to date have used transcript abundance as the primary molecular readout of cell state in PSCs. We performed a comprehensive proteogenomics analysis of 190 genetically diverse mouse embryonic stem cell (mESC) lines. The quantitative proteome is highly variable across lines, and we identified pluripotency-associated pathways that were differentially activated in the proteomics data that were not evident in transcriptome data from the same lines. Integration of protein abundance to transcript levels and chromatin accessibility revealed broad co-variation across molecular layers as well as shared and unique drivers of quantitative variation in pluripotency-associated pathways. Quantitative trait locus (QTL) mapping localized the drivers of these multi-omic signatures to genomic hotspots. This study reveals post-transcriptional mechanisms and genetic interactions that underlie quantitative variability in the pluripotent proteome and provides a regulatory map for mESCs that can provide a basis for future mechanistic studies

    Diagnostics and correction of batch effects in large-scale proteomic studies: a tutorial.

    Advancements in mass spectrometry-based proteomics have enabled experiments encompassing hundreds of samples. While these large sample sets deliver much-needed statistical power, handling them introduces technical variability known as batch effects. Here, we present a step-by-step protocol for the assessment, normalization, and batch correction of proteomic data. We review established methodologies from related fields and describe solutions specific to proteomic challenges, such as ion intensity drift and missing values in quantitative feature matrices. Finally, we compile a set of techniques that enable control of batch effect adjustment quality. We provide an R package, proBatch, containing functions required for each step of the protocol. We demonstrate the utility of this methodology on five proteomic datasets each encompassing hundreds of samples and consisting of multiple experimental designs. In conclusion, we provide guidelines and tools to make the extraction of true biological signal from large proteomic studies more robust and transparent, ultimately facilitating reliable and reproducible research in clinical proteomics and systems biology

    Assessing the Cumulative Contribution of New and Established Common Genetic Risk Factors to Early-Onset Prostate Cancer

    We assessed the evidence for association between 23 recently reported prostate cancer (PCa) variants and early-onset PCa and the aggregate value of 63 PCa variants for predicting early-onset disease using 931 unrelated men diagnosed with PCa prior to age 56 years and 1126 male controls

    Genetic Mapping of Multiple Metabolic Traits Identifies Novel Genes for Adiposity, Lipids and Insulin Secretory Capacity in Outbred Rats

    Despite the successes of human genome-wide association studies, the causal genes underlying most metabolic traits remain unclear. We used outbred heterogeneous stock (HS) rats, coupled with expression data and mediation analysis, to identify quantitative trait loci (QTLs) and candidate gene mediators for adiposity, glucose tolerance, serum lipids, and other metabolic traits. Physiological traits were measured in 1519 male HS rats, with liver and adipose transcriptomes measured in over 410 rats. Genotypes were imputed from low-coverage whole-genome sequence. Linear mixed models were used to detect physiological and expression QTLs (pQTLs and eQTLs, respectively), employing both SNP- and haplotype-based models for pQTL mapping. Genes with cis-eQTLs that overlapped pQTLs were assessed as causal candidates through mediation analysis. We identified 14 SNP-based pQTLs and 19 haplotype-based pQTLs, of which 10 were in common. Using mediation, we identified the following genes as candidate mediators of pQTLs: Grk5 for a fat pad weight pQTL on Chr1, Krtcap3 for fat pad weight and serum lipids pQTLs on Chr6, Ilrun for a fat pad weight pQTL on Chr20, and Rfx6 for a whole pancreatic insulin content pQTL on Chr20. Furthermore, we verified Grk5 and Krtcap3 using gene knock-down/out models, thereby shedding light on novel regulators of obesity