19 research outputs found

    Bayesian Inference of the Allelic Series in Multiparental Populations with Applications

    Multiparental populations (MPPs) are experimental populations in which the genome of every individual is a random mosaic of known founder haplotypes. These populations provide advantages for detecting quantitative trait loci (QTL) because tests of association between phenotypes and genetic variation can leverage inferred founder haplotype descent. It is difficult, however, to determine how haplotypes at a locus group into distinct functional alleles, termed the allelic series. The allelic series is important because it provides information about the number of causal variants at a QTL and their combined effects. We begin by analyzing QTL mapping power in a particular MPP, the Collaborative Cross (CC). We find that QTL mapping power depends on the allelic series and whether it is balanced or imbalanced with respect to the founder haplotypes. More generally, this study serves as a much-needed resource for designing CC experiments that are well-powered to detect QTL using haplotype-based approaches. Next, we introduce a fully Bayesian framework for inferring the allelic series. This framework accounts for sources of uncertainty found in typical MPPs, including individual haplotype states at the QTL, the size of the allele effects, and most importantly, the number and composition of functional alleles. Our prior distribution for the allelic series is based on the Chinese restaurant process, and we leverage its connection to the coalescent to introduce additional prior information about haplotype relatedness via a phylogenetic tree. This is the primary innovation of our research. We evaluate our approach via simulation and find that posterior inference of the allelic series is uncertain even when power is high. Despite this uncertainty, allele-based inference still improves effect estimation when the true number of functional alleles is small. Phylogenetic information improves posterior certainty of the allelic series, effect estimation, and statistical signal.
We find only marginal improvements in QTL mapping power using the allele-based approach without tree information, and although the tree-informed approach may perform better, implementing it in practice is challenging. We also apply our method to real data from the CC and the Drosophila Synthetic Population Resource, highlighting new insights facilitated by our allele-based association approach.
    Doctor of Philosophy
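    The Chinese restaurant process prior mentioned in the abstract above can be illustrated with a minimal sketch. It draws one partition of founder haplotypes into functional alleles; the function name, the default of eight haplotypes (the CC has eight founder strains), and the concentration value `alpha` are illustrative assumptions, not the thesis's implementation, and the tree-informed extension is not shown.

```python
import random

def sample_allelic_series(n_haplotypes=8, alpha=1.0, rng=None):
    """Draw one partition of haplotypes from a Chinese restaurant process.

    Returns a list of allele labels, one per haplotype. Haplotypes sharing
    a label are treated as carrying the same functional allele.
    `alpha` (hypothetical value) must be positive; larger values favor
    more distinct alleles.
    """
    rng = rng or random.Random()
    counts = []        # number of haplotypes already assigned to each allele
    assignments = []   # allele label for each haplotype, in order
    for _ in range(n_haplotypes):
        # Join an existing allele with weight equal to its current size,
        # or open a new allele with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments

if __name__ == "__main__":
    print(sample_allelic_series(rng=random.Random(1)))
```

    Each call returns labels numbered in order of first appearance, so the number of functional alleles in a draw is `max(labels) + 1`.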

    A Bayesian model selection approach to mediation analysis.

    Genetic studies often seek to establish a causal chain of events originating from genetic variation through to molecular and clinical phenotypes. When multiple phenotypes share a common genetic association, one phenotype may act as an intermediate for the genetic effects on the other. Alternatively, the phenotypes may be causally unrelated but share genetic loci. Mediation analysis represents a class of causal inference approaches used to determine which of these scenarios is most plausible. We have developed a general approach to mediation analysis based on Bayesian model selection and have implemented it in an R package, bmediatR. Bayesian model selection provides a flexible framework that can be tailored to different analyses. Our approach can incorporate prior information about the likelihood of models and the strength of causal effects. It can also accommodate multiple genetic variants or multi-state haplotypes. Our approach reports posterior probabilities that can be useful in interpreting uncertainty among competing models. We compared bmediatR with other popular methods, including the Sobel test, Mendelian randomization, and Bayesian network analysis, using simulated data. We found that bmediatR performed as well as or better than these alternatives in most scenarios. We applied bmediatR to proteome data from Diversity Outbred (DO) mice, a multi-parent population, and demonstrate the power of mediation with multi-state haplotypes. We also applied bmediatR to data from human cell lines to identify transcripts that are mediated through, or are expressed independently from, local chromatin accessibility. We demonstrate that Bayesian model selection provides a powerful and versatile approach to identify causal relationships in genetic studies using model organism or human data.
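    For orientation, the Sobel test named above as a comparison method can be sketched as follows. This is the standard first-order Sobel statistic, not part of bmediatR; the function name and the example coefficients are hypothetical. Here `a` is the estimated effect of the genetic variant on the mediator and `b` is the estimated effect of the mediator on the outcome, each with its standard error.

```python
import math

def sobel_test(a, b, se_a, se_b):
    """Return the Sobel z statistic and two-sided normal p-value
    for the indirect effect a * b."""
    # First-order standard error of the product a * b.
    se_ab = math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    z = (a * b) / se_ab
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

if __name__ == "__main__":
    # Hypothetical estimates, for illustration only.
    z, p = sobel_test(a=0.5, b=0.4, se_a=0.1, se_b=0.1)
    print(f"z = {z:.3f}, p = {p:.4f}")
```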

    Economies of scale in federally-funded state-organized public health programs: results from the National Breast and Cervical Cancer Early Detection Programs

    This study investigates the existence of economies of scale in the provision of breast and cervical cancer screening and diagnostic services by state National Breast and Cervical Cancer Early Detection Program (NBCCEDP) grantees. A translog cost function is estimated as a system with input factor share equations. The estimated cost function is then used to determine output levels for which average costs are decreasing (i.e., economies of scale exist). Data were collected from all state NBCCEDP programs and the District of Columbia for program years 2006–2007, 2008–2009 and 2009–2010 (N = 147). Costs included all programmatic and in-kind contributions from federal and non-federal sources, allocated to breast and cervical cancer screening activities. Output was measured by women served, women screened and cancers detected, separately by breast and cervical services for each measure. Inputs included labor, rent and utilities, clinical services, and quasi-fixed factors (e.g., percent of women eligible for screening by the NBCCEDP). 144 out of 147 program-years demonstrated significant economies of scale for women served and women screened; 136 out of 145 program-years displayed significant economies of scale for cancers detected. The cost data were self-reported by the NBCCEDP state programs. Quasi-fixed inputs were allowed to affect costs but not economies of scale or the share equations. The main analysis accounted for clustering of observations within state programs, but it did not make full use of the panel data. The average cost of providing breast and cervical cancer screening services decreases as the number of women screened and served increases.
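    The link between a translog cost function and economies of scale can be sketched for a simplified single-output case. Assuming ln C = a0 + a1 ln y + 0.5 a2 (ln y)^2 plus terms that do not involve output, the cost elasticity with respect to output is a1 + a2 ln y, and average cost is decreasing wherever that elasticity is below one. The coefficients below are hypothetical, and the study's multi-output system with factor share equations is not reproduced.

```python
import math

def cost_elasticity(y, a1, a2):
    """Elasticity of cost with respect to output y for a single-output
    translog cost function: d ln C / d ln y = a1 + a2 * ln(y)."""
    return a1 + a2 * math.log(y)

def has_economies_of_scale(y, a1, a2):
    """Average cost is decreasing at y when the cost elasticity is below 1."""
    return cost_elasticity(y, a1, a2) < 1.0

if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    a1, a2 = 0.6, 0.05
    for y in (100, 1_000, 10_000, 100_000):
        print(y, round(cost_elasticity(y, a1, a2), 3),
              has_economies_of_scale(y, a1, a2))
```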

    Explaining variation across grantees in breast and cervical cancer screening proportions in the NBCCEDP

    There is substantial variation across the National Breast and Cervical Cancer Early Detection Program (NBCCEDP) grantees in the proportion of the eligible population served each year (hereafter referred to as the screening proportion). In this paper, we assess program- and state-level factors to better understand the reasons for this variation in breast and cervical cancer screening proportions across the NBCCEDP grantees.

    Genetic Mapping of Multiple Metabolic Traits Identifies Novel Genes for Adiposity, Lipids and Insulin Secretory Capacity in Outbred Rats

    Despite the successes of human genome-wide association studies, the causal genes underlying most metabolic traits remain unclear. We used outbred heterogeneous stock (HS) rats, coupled with expression data and mediation analysis, to identify quantitative trait loci (QTLs) and candidate gene mediators for adiposity, glucose tolerance, serum lipids, and other metabolic traits. Physiological traits were measured in 1519 male HS rats, with liver and adipose transcriptomes measured in over 410 rats. Genotypes were imputed from low-coverage whole-genome sequence. Linear mixed models were used to detect physiological and expression QTLs (pQTLs and eQTLs, respectively), employing both SNP- and haplotype-based models for pQTL mapping. Genes with cis-eQTLs that overlapped pQTLs were assessed as causal candidates through mediation analysis. We identified 14 SNP-based pQTLs and 19 haplotype-based pQTLs, of which 10 were in common. Using mediation, we identified the following genes as candidate mediators of pQTLs: Grk5 for a fat pad weight pQTL on Chr1, Krtcap3 for fat pad weight and serum lipids pQTLs on Chr6, Ilrun for a fat pad weight pQTL on Chr20 and Rfx6 for a whole pancreatic insulin content pQTL on Chr20. Furthermore, we verified Grk5 and Krtcap3 using gene knock-down/out models, thereby shedding light on novel regulators of obesity.

    GlobalFiler Express DNA amplification kit in South Africa: Extracting the past from the present

    In this study, the GlobalFiler Express amplification kit was evaluated for forensic use in 541 South African individuals belonging to the Afrikaaner, amaXhosa, amaZulu, Asian Indian and Coloured population groups. Allelic frequencies, genetic diversity parameters and forensic informative metrics were calculated for each population. A total of 301 alleles were observed, ranging between 5 and 44.2 repeat units; 43 were rarely observed partial repeats and seven were novel. The combined match probability (CMP) ranged from 2.21x10 (Coloured) to 5.21x10 (AmaZulu), and the combined power of exclusion (CPE) from 0.9999999978 (Afrikaaner) to 0.99999999979 (AmaZulu). No significant departures from Hardy-Weinberg equilibrium (HWE) were observed after Bonferroni correction. Strong evidence of genetic structure was detected using the coancestry coefficient, Analysis of Molecular Variance (AMOVA) and an unsupervised Bayesian clustering method (STRUCTURE). The efficiency of assignment of individuals to population groups was evaluated by applying likelihood ratios with WHICHRUN, and the individual ancestral membership probabilities inferred by STRUCTURE. Likelihood ratios performed the best in the assignment of individuals to population groups. Signs of positive selection were detected for TH01 and D13S317 and purifying/balancing selection for locus SE33. These three loci also displayed the largest informativeness for assignment (In) values. The results of this study support the use of the GlobalFiler STR profiling kit for forensic applications in South Africa with the additional capability to predict ethnicity or continental origin of a random sample.
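    The two summary metrics reported above are conventionally combined across loci under an assumption of independence: the combined match probability is the product of the per-locus match probabilities, and the combined power of exclusion is one minus the product of the per-locus non-exclusion probabilities. A minimal sketch follows; the function names and per-locus values are hypothetical and are not taken from the study.

```python
from math import prod

def combined_match_probability(locus_match_probs):
    """Product of per-locus match probabilities, assuming independent loci."""
    return prod(locus_match_probs)

def combined_power_of_exclusion(locus_pe):
    """1 minus the product of per-locus (1 - PE), assuming independent loci."""
    return 1.0 - prod(1.0 - pe for pe in locus_pe)

if __name__ == "__main__":
    # Hypothetical per-locus values, for illustration only.
    match_probs = [0.05, 0.08, 0.03, 0.06]
    pe_values = [0.60, 0.55, 0.70, 0.65]
    print(f"CMP = {combined_match_probability(match_probs):.3e}")
    print(f"CPE = {combined_power_of_exclusion(pe_values):.6f}")
```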

    Determinants of QTL Mapping Power in the Realized Collaborative Cross

    The Collaborative Cross (CC) is a mouse genetic reference population whose range of applications includes quantitative trait loci (QTL) mapping. The design of a CC QTL mapping study involves multiple decisions, including which and how many strains to use, and how many replicates per strain to phenotype, all viewed within the context of hypothesized QTL architecture. Until now, these decisions have been informed largely by early power analyses that were based on simulated, hypothetical CC genomes. Now that more than 50 CC strains are available and more than 70 CC genomes have been observed, it is possible to characterize power based on realized CC genomes. We report power analyses from extensive simulations and examine several key considerations: 1) the number of strains and biological replicates, 2) the QTL effect size, 3) the presence of population structure, and 4) the distribution of functionally distinct alleles among the founder strains at the QTL. We also provide general power estimates to aid in the design of future experiments. All analyses were conducted with our R package, SPARCC (Simulated Power Analysis in the Realized Collaborative Cross), developed for performing either large-scale power analyses or those tailored to particular CC experiments.