
    Example of sensitivity analysis on a dataset which shows evidence for colocalisation at a predefined rule of posterior P(H4) > 0.5 only when the prior beliefs in H3 and H4 are approximately equal.

    The left-hand panels show local Manhattan plots for the two traits, while the right-hand panels show prior and posterior probabilities for H0-H4 as a function of p12. The dashed vertical line indicates the value of p12 used in the initial analysis (the value about which sensitivity is to be checked). H0 is omitted from the prior plot so that the relative differences between the other hypotheses can be seen.

    Effects of varying p12 on the prior for H4 (coloured lines) compared to H3 (dashed line) as a function of the number of SNPs in the region.

    For all plots, p1 = p2 = 10⁻⁴ is held constant. The coloured squares highlight the points where P(H3) = P(H4) for different values of p12.
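The crossover points highlighted by the squares can be illustrated with a minimal sketch. It assumes the usual approximation for a region of n SNPs in which the prior for H3 is n(n−1)·p1·p2 and the prior for H4 is n·p12; these formulas are not stated in the caption and are an assumption here, and the function names are hypothetical.

```python
def hypothesis_priors(nsnps, p1=1e-4, p2=1e-4, p12=1e-5):
    """Approximate per-hypothesis priors for a region of `nsnps` SNPs.

    Assumed approximation (not taken from the caption):
    H1 = n*p1, H2 = n*p2, H3 = n*(n-1)*p1*p2, H4 = n*p12, H0 = remainder.
    """
    h1 = nsnps * p1
    h2 = nsnps * p2
    h3 = nsnps * (nsnps - 1) * p1 * p2
    h4 = nsnps * p12
    h0 = 1.0 - (h1 + h2 + h3 + h4)
    return {"H0": h0, "H1": h1, "H2": h2, "H3": h3, "H4": h4}


def crossover_nsnps(p1=1e-4, p2=1e-4, p12=1e-5):
    """Number of SNPs at which P(H3) = P(H4) under the approximation above."""
    # n*(n-1)*p1*p2 = n*p12  =>  n = 1 + p12 / (p1*p2)
    return 1 + p12 / (p1 * p2)


# Illustrative p12 values only; they are not read off the figure.
for p12 in (1e-6, 5e-6, 1e-5):
    n = crossover_nsnps(p12=p12)
    print(f"p12={p12:g}: P(H3) = P(H4) at about {n:.0f} SNPs")
```

Under this approximation the prior for H3 grows quadratically with the number of SNPs while the prior for H4 grows linearly, so for fixed p12 the prior favours H3 in larger regions.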

    Average posterior probabilities for each hypothesis under different analysis strategies when trait 1 has two causal variants, A and B, and trait 2 has just one.

    The left column shows the identity of the causal variants for each trait and their relative effect sizes under four different models. The right column shows the average posterior that can be assigned to specific comparisons of variants for trait 1 : trait 2. We exploit our knowledge of the identity of the causal variants in simulated data to label each comparison according to LD between the lead SNP for each trait and the simulated causal variants. When labels cannot be unambiguously assigned (r² < 0.8 with any causal variant), we use “?”.

    Example where the conditional coloc approach, run in iterative mode, finds misleading results.

    a and b show the “observed” data (simulated from 1000 SNPs with MAF > 0.01) as −log10 p values for traits 1 and 2 respectively. Trait 1 has one causal variant, A, and trait 2 has two, A and B. Conditioning identifies a second independent signal for trait 2, and the results of conditioning on the strongest signal are shown in c. Coloc comparisons are based on (a, b) and (a, c), and both find that the posterior probability (PP) of the shared causal variant hypothesis H4 is > 0.8. SuSiE analysis of the same data finds one credible set for trait 1, and log10 Bayes factors (BF) for this are shown in d. It finds two credible sets for trait 2, and the log10 BF for these are shown in e and f. Coloc comparisons are based on (d, e) and (d, f) and find PP of H4 of > 0.9 and < 10⁻⁴ respectively. Blue and green points are used to highlight SNPs in LD (r² > 0.8) with the true causal variants A and B respectively. The data underlying this figure are available in S1 Data.

    Average posterior probability distributions in simulated data.

    The four classes of simulated datasets are shown in four rows, with the scenario indicated in the left-hand column. For example, the top row shows a scenario where traits 1 and 2 have distinct causal variants A and B. Columns indicate the different analysis methods, with susie indicating SuSiE, cond_it indicating that coloc-conditioning was run in iterative mode, and cond_abo indicating it was run in “all but one” mode. For each simulation, the number of tests performed is at most 1 for “single”, or equal to the product of the numbers of signals detected for the other methods. For each test, we estimated which pair of variants was being tested according to the LD between the variant with the highest fine-mapping posterior probability of causality for each trait and the true causal variants A and B. If r² > 0.5 between the fine-mapped variant and true causal variant A, and r² with A was higher than r² with B, we labelled the test variant A, and vice versa for B. Where at least one test variant could not be unambiguously assigned, we labelled the pair “?”. The total height of each bar represents the proportion of comparisons that were run for that variant pair, out of the number of simulations run, and typically does not reach 1 because there is not always power to perform all possible tests. Note that because we do not limit the number of tests, the height of the bar has the potential to exceed 1, but did not do so in practice. The shaded proportion of each bar corresponds to the average posterior for the indicated hypothesis, defined as the ratio of the sum of posterior probabilities for that hypothesis to the number of simulations performed. Recall that H0 indicates no associated variants for either trait, H1 and H2 a single causal variant for traits 1 and 2 respectively, and H3 and H4 that both traits are associated, with distinct or shared causal variants respectively. Each simulated region contains 1000 SNPs.
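The labelling rule described in this legend can be written down as a minimal sketch. The function and argument names are hypothetical, and the inputs are assumed to be r² values between each trait's top fine-mapped variant and the true causal variants A and B.

```python
def label_variant(r2_with_a, r2_with_b, threshold=0.5):
    """Label a fine-mapped variant "A", "B" or "?" from its LD with A and B."""
    if r2_with_a > threshold and r2_with_a > r2_with_b:
        return "A"
    if r2_with_b > threshold and r2_with_b > r2_with_a:
        return "B"
    return "?"  # not unambiguously assignable


def label_pair(trait1_r2, trait2_r2, threshold=0.5):
    """Label a tested pair; one ambiguous member makes the whole pair "?".

    Each argument is a tuple (r2 with A, r2 with B) for that trait's variant.
    """
    labels = [
        label_variant(*trait1_r2, threshold),
        label_variant(*trait2_r2, threshold),
    ]
    if "?" in labels:
        return "?"
    return "".join(labels)
```

For instance, `label_pair((0.9, 0.1), (0.2, 0.8))` labels the trait 1 variant A and the trait 2 variant B, so the pair is "AB".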

    Results of colocalisation simulations.

    The columns shown are: scenario: the simulated causal variants in traits 1 and 2 (for example, A-AB indicates that trait 1 has causal variant A and trait 2 has causal variants A and B); nsnps_in_region: number of SNPs in the simulated region (1000 or 3000); method: method used for coloc analysis; inferred_cv_pair: estimated pair of causal variants under test; H0, H1, H2, H3, H4: average posterior support for each hypothesis. The average is calculated as the sum of posterior probabilities for each hypothesis divided by the number of simulations run. As some variant pairs are unlikely to be tested (e.g. the pair AA is unlikely to be tested in the scenario A-B), this is not the expected posterior support given that AA is tested. (CSV)
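The averaging used for the H0 to H4 columns can be sketched as follows. This is an illustration under assumed inputs, not the code that produced the table: `tests` is a hypothetical list with one entry per coloc test actually run, and the function name is made up for this example.

```python
from collections import defaultdict

HYPOTHESES = ("H0", "H1", "H2", "H3", "H4")


def average_posteriors(tests, n_simulations):
    """Sum posteriors per inferred variant pair, then divide by simulations run.

    tests: iterable of (inferred_cv_pair, {hypothesis: posterior}) tuples.
    n_simulations: number of simulations run, NOT the number of tests.
    """
    sums = defaultdict(lambda: dict.fromkeys(HYPOTHESES, 0.0))
    for pair, posterior in tests:
        for h in HYPOTHESES:
            sums[pair][h] += posterior[h]
    return {
        pair: {h: total / n_simulations for h, total in by_h.items()}
        for pair, by_h in sums.items()
    }


# Two simulations, but the pair "AA" was only tested in one of them.
example = [
    ("AA", {"H0": 0.0, "H1": 0.0, "H2": 0.0, "H3": 0.1, "H4": 0.9}),
    ("AB", {"H0": 0.0, "H1": 0.0, "H2": 0.0, "H3": 0.8, "H4": 0.2}),
    ("AB", {"H0": 0.0, "H1": 0.0, "H2": 0.0, "H3": 0.6, "H4": 0.4}),
]
averages = average_posteriors(example, n_simulations=2)
```

Because the denominator is the number of simulations rather than the number of tests for that pair, a rarely tested pair receives a small average even when its individual posteriors are high, which is the caveat noted for the pair AA.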

    Companion to Fig 1, showing the results for simulated datasets with 3000 SNPs.

    Legend otherwise as for Fig 1. (TIF)

    Masking as an alternative strategy to conditioning when attempting to colocalise trait signals with multiple causal variants in a region.

    Top panel: input local Manhattan plots, with causal variants for each trait highlighted in red. We can use conditioning (left column) to perform multiple colocalisation analyses in a region. First, lead SNPs for each signal are identified by successively conditioning on selected SNPs and adding the most significant SNP out of the remainder, until some significance threshold is no longer reached. Then we condition on all but one lead SNP for each parallel coloc analysis. Note that when multiple lead SNPs are identified for each trait, e.g. n and m for traits 1 and 2 respectively, then n × m coloc analyses are performed. When an allele-aligned LD matrix is not available, an alternative is masking (right column), which differs by successively restricting the search space to SNPs not in LD with any lead SNPs instead of conditioning. Multiple coloc analyses are again performed, but setting the per-SNP Bayes factor to 1 for hypotheses containing SNPs in LD with any but one of the lead SNPs. Note that for convenience of display, all SNPs in r² > α with the lead SNP are assumed to be in a contiguous block, shaded grey.
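The masking step can be illustrated with a simplified sketch that works on one trait at a time: per-SNP log Bayes factors are set to 0 (that is, BF = 1) for SNPs whose r² with any lead SNP other than the retained one exceeds α. The data layout, function name and argument names are assumptions made for illustration and do not come from the coloc package.

```python
from itertools import product


def mask_log_bf(log_bf, r2_with_leads, keep_lead, alpha):
    """Return a copy of per-SNP log BFs with masked SNPs set to 0 (BF = 1).

    log_bf: list of per-SNP log Bayes factors for one trait.
    r2_with_leads: dict mapping lead SNP name -> list of r2 between
        each SNP in the region and that lead SNP.
    keep_lead: the one lead SNP whose signal is retained.
    alpha: r2 threshold above which a SNP counts as "in LD".
    """
    masked = list(log_bf)
    for lead, r2_values in r2_with_leads.items():
        if lead == keep_lead:
            continue  # never mask around the signal under test
        for i, r2 in enumerate(r2_values):
            if r2 > alpha:
                masked[i] = 0.0  # log(1) = 0
    return masked


# With n lead SNPs for trait 1 and m for trait 2, n x m analyses are run.
trait1_leads = ["s1", "s2"]
trait2_leads = ["t1", "t2", "t3"]
analyses = list(product(trait1_leads, trait2_leads))
```

Each pair in `analyses` would correspond to one coloc run in which every lead SNP except the named one is masked for its trait.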

    Example where the conditional coloc approach, run in “all but one” mode finds misleading results.

    a and b show the observed data (−log10 p values) for traits 1 and 2 respectively. Conditioning identifies two independent signals for trait 2, and the results of conditioning on the signal closest to causal variants A and B are shown in c and d respectively. Coloc comparisons are based on (a, c) and then (a, d). SuSiE analysis of the same data finds one signal in trait 1, and log10 Bayes factors (BF) for this signal are shown in e. It finds two signals for trait 2, and the log10 BF for these are shown in f and g. Coloc comparisons are based on (e, f) and (e, g). The boxes on the lower plots show the results of running coloc analysis on that dataset against the data for trait 1 shown in a or e as appropriate. The data underlying this figure are available in S1 Data. (TIF)