95 research outputs found

    Illustration of potential complications in interpreting direct and indirect effects.

    <p>The three graphs a)-c) on the left illustrate different hypothetical scenarios that could all lead to the inference that both and are directly associated with (illustrated by the graph on the right). In each graph square nodes represent observed quantities, and circular nodes represent unobserved quantities. In a) both are indirectly associated with via an unmeasured factor, . In b) and are noisy observations of underlying variables and , where is associated indirectly by via . In c) is associated with indirectly via , and an unmeasured factor affects both of them. In all three cases both and are associated with , and, further, due to the existence of unmeasured variables, is conditionally dependent on given , leading to the inference (right) that both and are “directly” associated with.</p>

    Illustration of three simple scenarios of association between genotype and a bivariate phenotype.

    <p>All three scenarios involve positively-correlated bivariate response, which for concreteness we refer to as <i>height</i> (-axis) and <i>weight</i> (-axis). Each point represents an individual, colored according to their genotype (0, 1 or 2 copies of the minor allele). A) A variant associated with <i>weight</i> but not <i>height</i>. Even though <i>height</i> is unassociated, it nonetheless clearly helps to consider <i>weight</i> and <i>height</i> jointly in testing for association: the separation between genotype classes in the two-dimensional space is substantially greater than the separation along the <i>weight</i> axis alone. In fact, here the most powerful analysis would be the test for association with <i>weight</i>, controlling for <i>height</i>. B) The minor allele decreases <i>height</i> but increases <i>weight</i>: it is an allele for being “short and fat”. Here the three genotype classes are much better separated in the two-dimensional space than for either phenotype individually. Should one be lucky enough to encounter such a genetic variant, a multivariate test would be considerably more powerful to detect it than either univariate test. C) Here the minor allele increases <i>height</i>, and <i>as a result</i> increases <i>weight</i>, resulting in what we will call an “indirect” association with <i>weight</i>. In this case the separation of the groups in the bivariate space is no greater than the separation along the <i>height</i> axis alone, and the most powerful analysis would be a univariate test for association with <i>height</i>. In all panels, the differences among genotype classes were deliberately made very large for clarity of presentation.</p>
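Scenario A can be illustrated with a small simulation. The sketch below is not the method of the paper (which uses Bayes factors); it swaps in ordinary least squares t-statistics, and the sample size, allele frequency, effect size, correlation and the function names `t_stat_for_genotype` and `simulate_scenario_a` are all assumptions chosen for illustration. It compares a univariate test of weight against a test of weight that controls for height.

```python
import numpy as np


def t_stat_for_genotype(y, covariates):
    """OLS t-statistic for the first covariate (genotype) in y ~ 1 + covariates."""
    n = y.shape[0]
    X = np.column_stack([np.ones(n)] + covariates)
    coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = n - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    # Column 0 is the intercept, column 1 is the genotype.
    return coef[1] / np.sqrt(cov[1, 1])


def simulate_scenario_a(seed=0, n=5000, maf=0.3, beta=0.3, rho=0.7):
    """Hypothetical scenario A: genotype affects weight only; height and weight correlate."""
    rng = np.random.default_rng(seed)
    genotype = rng.binomial(2, maf, size=n).astype(float)  # 0, 1 or 2 minor alleles
    height = rng.standard_normal(n)                        # unassociated with genotype
    noise = rng.standard_normal(n)
    weight = rho * height + np.sqrt(1.0 - rho**2) * noise + beta * genotype

    t_univariate = t_stat_for_genotype(weight, [genotype])
    t_adjusted = t_stat_for_genotype(weight, [genotype, height])
    return t_univariate, t_adjusted


t_uni, t_adj = simulate_scenario_a()
print(f"weight ~ genotype:          t = {t_uni:.2f}")
print(f"weight ~ genotype + height: t = {t_adj:.2f}")
```

Under these assumptions, adjusting for height removes the share of the weight variance that height explains, so the genotype effect is estimated against a smaller residual variance and the adjusted statistic tends to be larger.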

    Table of genes from the Global Lipids study [34] that, in our analysis, are best classified as being unassociated with one of the four lipid traits. (All other genes were best classified as being associated with all four lipid traits.)

    <p>The univariate associations in column 2 are the phenotypes reported as being associated with each SNP in the univariate analyses from <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0065245#pone.0065245-Teslovich1" target="_blank">[34]</a>. The posterior probability (column 4) shows the assessed probability that the listed trait (column 3) is actually unassociated.</p>

    A Unified Framework for Association Analysis with Multiple Related Phenotypes

    <div><p>We consider the problem of assessing associations between multiple related outcome variables, and a single explanatory variable of interest. This problem arises in many settings, including genetic association studies, where the explanatory variable is genotype at a genetic variant. We outline a framework for conducting this type of analysis, based on Bayesian model comparison and model averaging for multivariate regressions. This framework unifies several common approaches to this problem, and includes both standard univariate and standard multivariate association tests as special cases. The framework also unifies the problems of <i>testing</i> for associations and <i>explaining</i> associations – that is, identifying which outcome variables are associated with genotype. This provides an alternative to the usual, but conceptually unsatisfying, approach of resorting to univariate tests when explaining and interpreting significant multivariate findings. The method is computationally tractable genome-wide for modest numbers of phenotypes (e.g. 5–10), and can be applied to summary data, without access to raw genotype and phenotype data. We illustrate the methods both on simulated examples and on a genome-wide association study of blood lipid traits, where we identify 18 potential novel genetic associations that were not identified by univariate analyses of the same data.</p></div>

    Comparison of Bayes Factors in simple bivariate simulations, correlation  = 0.4.

    <p>The upper panel shows a typical simulated dataset under each of three scenarios (see text), but with effect sizes increased to aid clarity; each dot represents a single individual, colored according to genotype. Note that in the middle scenario only is associated with genotype. The lower panel compares and with a reference BF, which is the theoretical optimal for that simulation scenario. Thus one can see not only how the BFs compare with each other, but also the extent to which they lose compared with the optimal. Each point represents the results from a single simulation, and each simulation is represented by three points.</p>

    Comparison of Bayes Factors in simple bivariate simulations, correlation  = 0.7.

    <p>See caption to <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0065245#pone-0065245-g003" target="_blank">Figure 3</a> for more details.</p>

    Power comparison of different test statistics under different simulated (five dimensional) multivariate scenarios.

    <p>Each line shows the power vs size for a different test statistic; the univariate tests ( and ANOVA) are indicated by dotted lines. See main text for details of each simulation scenario and the test statistics compared.</p>

    Putative novel associations identified by multivariate analysis of the Global Lipids Data.

    <p>All these SNPs have , and are more than 0.5Mb from any SNP identified in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0065245#pone.0065245-Teslovich1" target="_blank">[34]</a>. (*) These three SNPs map to a region of complex structure on chromosome 11 containing a large number of olfactory receptors, and are in LD with one another despite mapping Mb apart (possibly reflecting mapping errors).</p>

    A graphical representation of the model corresponding to a partition

    <p>. Each of the nodes represents a subset of the measured phenotypes . The simplest interpretation of the graph is as representing causal relationships among variables. In this interpretation a directed arrow from one node to another represents a direct causal effect, so, for example, the genotype has a direct causal effect on the variables , which in turn affects . A more flexible interpretation is in terms of the conditional independencies among variables that would result from such a causal network. The rules for obtaining these conditional independencies involve the notion of d-separation <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0065245#pone.0065245-Peer1" target="_blank">[63]</a>, which we do not go into here. Instead we simply note that the conditional independencies encoded by this graph include : is independent of ; and : is conditionally independent of given (because all paths from to go through or ). Note that the absence of any arrows in the direction from to is justified by our treating as a randomized intervention. (For those familiar with Directed Acyclic Graphical (DAG) models, here each node represents a collection of variables, and we allow for arbitrary correlations among the variables within each node. Thus, in a full DAG representation arrows would exist between all pairs of variables: arrows between variables in different nodes would go in the direction indicated by the figure. Arrows between variables within a node could go in any direction, subject to the constraint that the resulting graph must be acyclic.)</p>

    Integrated Enrichment Analysis of Variants and Pathways in Genome-Wide Association Studies Indicates Central Role for IL-2 Signaling Genes in Type 1 Diabetes, and Cytokine Signaling Genes in Crohn's Disease

    <div><p>Pathway analyses of genome-wide association studies aggregate information over sets of related genes, such as genes in common pathways, to identify gene sets that are <i>enriched</i> for variants associated with disease. We develop a model-based approach to pathway analysis, and apply this approach to data from the Wellcome Trust Case Control Consortium (WTCCC) studies. Our method offers several benefits over existing approaches. First, our method not only interrogates pathways for enrichment of disease associations, but also estimates the level of enrichment, which yields a coherent way to promote variants in enriched pathways, enhancing discovery of genes underlying disease. Second, our approach allows for multiple enriched pathways, a feature that leads to novel findings in two diseases where the major histocompatibility complex (MHC) is a major determinant of disease susceptibility. Third, by modeling disease as the combined effect of multiple markers, our method automatically accounts for linkage disequilibrium among variants. Interrogation of pathways from eight pathway databases yields strong support for enriched pathways, indicating links between Crohn's disease (CD) and cytokine-driven networks that modulate immune responses; between rheumatoid arthritis (RA) and “Measles” pathway genes involved in immune responses triggered by measles infection; and between type 1 diabetes (T1D) and IL2-mediated signaling genes. Prioritizing variants in these enriched pathways yields many additional putative disease associations compared to analyses without enrichment. For CD and RA, 7 of 8 additional non-MHC associations are corroborated by other studies, providing validation for our approach. For T1D, prioritization of IL-2 signaling genes yields strong evidence for 7 additional non-MHC candidate disease loci, as well as suggestive evidence for several more. 
Of the 7 strongest associations, 4 are validated by other studies, and 3 (near IL-2 signaling genes <i>RAF1</i>, <i>MAPK14</i>, and <i>FYN</i>) constitute novel putative T1D loci for further study.</p></div>