83 research outputs found
A pleiotropy-informed Bayesian false discovery rate adapted to a shared control design finds new disease associations from GWAS summary statistics.
Genome-wide association studies (GWAS) have been successful in identifying single nucleotide polymorphisms (SNPs) associated with many traits and diseases. However, at existing sample sizes, these variants explain only part of the estimated heritability. Leverage of GWAS results from related phenotypes may improve detection without the need for larger datasets. The Bayesian conditional false discovery rate (cFDR) constitutes an upper bound on the expected false discovery rate (FDR) across a set of SNPs whose p values for two diseases are both less than two disease-specific thresholds. Calculation of the cFDR requires only summary statistics and has several advantages over traditional GWAS analysis. However, existing methods require distinct control samples between studies. Here, we extend the technique to allow for some or all controls to be shared, increasing applicability. Several different SNP sets can be defined with the same cFDR value, and we show that the expected FDR across the union of these sets may exceed expected FDR in any single set. We describe a procedure to establish an upper bound for the expected FDR among the union of such sets of SNPs. We apply our technique to pairwise analysis of p values from ten autoimmune diseases with variable sharing of controls, enabling discovery of 59 SNP-disease associations which do not reach GWAS significance after genomic control in individual datasets. Most of the SNPs we highlight have previously been confirmed using replication studies or larger GWAS, a useful validation of our technique; we report eight SNP-disease associations across five diseases not previously declared. Our technique extends and strengthens the previous algorithm, and establishes robust limits on the expected FDR.
This approach can improve SNP detection in GWAS, and give insight into shared aetiology between phenotypically related conditions. This work was funded by the JDRF (9-2011-253), the Wellcome Trust (061858 and 091157) and the NIHR Cambridge Biomedical Research Centre. The research leading to these results has received funding from the European Union's 7th Framework Programme (FP7/2007–2013) under grant agreement no. 241447 (NAIMIT). JL is funded by the NIHR Cambridge Biomedical Research Centre and is on the Wellcome Trust PhD programme in Mathematical Genomics and Medicine at the University of Cambridge. CW is funded by the Wellcome Trust (089989). The Cambridge Institute for Medical Research (CIMR) is in receipt of a Wellcome Trust Strategic Award (100140). ImmunoBase.org is supported by Eli-Lilly and Company. The use of DNA from the UK Blood Services collection of Common Controls (UKBS collection) was funded by the Wellcome Trust grant 076113/C/04/Z, by the Wellcome Trust/JDRF grant 061858, and by the National Institute of Health Research of England. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. This is the final published version of the article. It was originally published in PLOS Genetics (Liley J, Wallace C PLOS Genetics 2015, 11(2): e1004926. doi:10.1371/journal.pgen.1004926) http://dx.doi.org/10.1371/journal.pgen.1004926
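As a rough illustration of the quantity this abstract discusses, the sketch below computes the standard empirical cFDR estimate for each SNP: the principal p value multiplied by the number of SNPs whose conditional p value is at or below the threshold, divided by the number of SNPs at or below both thresholds. The function name `empirical_cfdr` and the toy p values are hypothetical, and the sketch assumes distinct control samples; the shared-control adjustment and the bound on FDR across unions of SNP sets described above are not implemented here.

```python
import numpy as np


def empirical_cfdr(p_principal, p_conditional):
    """Empirical cFDR estimate for each SNP (assumes distinct controls).

    For SNP k with p values (p_k, q_k) the estimate is
        p_k * #{SNPs with q <= q_k} / #{SNPs with p <= p_k and q <= q_k},
    capped at 1.
    """
    p = np.asarray(p_principal, dtype=float)
    q = np.asarray(p_conditional, dtype=float)
    out = np.empty_like(p)
    for k in range(p.size):
        in_strip = q <= q[k]            # SNPs passing the conditional threshold
        in_box = in_strip & (p <= p[k])  # SNPs passing both thresholds
        # in_box always contains SNP k itself, so the denominator is >= 1
        out[k] = min(1.0, p[k] * in_strip.sum() / in_box.sum())
    return out


# Hypothetical toy data: four SNPs, p values for the principal and conditional traits
p_toy = [0.01, 0.5, 0.2, 0.04]
q_toy = [0.02, 0.9, 0.03, 0.8]
cfdr_toy = empirical_cfdr(p_toy, q_toy)
```

A SNP is typically declared significant when its estimated cFDR falls below a chosen cutoff; the loop is quadratic in the number of SNPs, which is acceptable for a sketch but not for genome-wide data.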
Accurate error control in high-dimensional association testing using conditional false discovery rates.
High-dimensional hypothesis testing is ubiquitous in the biomedical sciences, and informative covariates may be employed to improve power. The conditional false discovery rate (cFDR) is a widely used approach suited to the setting where the covariate is a set of p-values for the equivalent hypotheses for a second trait. Although related to the Benjamini-Hochberg procedure, it does not permit any easy control of type-1 error rate and existing methods are over-conservative. We propose a new method for type-1 error rate control based on identifying mappings from the unit square to the unit interval defined by the estimated cFDR and splitting observations so that each map is independent of the observations it is used to test. We also propose an adjustment to the existing cFDR estimator which further improves power. We show by simulation that the new method more than doubles potential improvement in power over unconditional analyses compared to existing methods. We demonstrate our method on transcriptome-wide association studies and show that the method can be used in an iterative way, enabling the use of multiple covariates successively. Our methods substantially improve the power and applicability of cFDR analysis
Statistical co-analysis of high-dimensional association studies
Modern medical practice and science involve complex phenotypic definitions. Understanding patterns of association across this range of phenotypes requires co-analysis of high-dimensional association studies in order to characterise shared and distinct elements. In this thesis I address several problems in this area, with a general linking aim of making more efficient use of available data. The main application of these methods is in the analysis of genome-wide association studies (GWAS) and similar studies.
Firstly, I developed methodology for a Bayesian conditional false discovery rate (cFDR) for leveraging GWAS results using summary statistics from a related disease. I extended an existing method to enable a shared control design, increasing power and applicability, and developed an approximate bound on false-discovery rate (FDR) for the procedure. Using the new method I identified several new variant-disease associations. I then developed a second application of shared control design in the context of study replication, enabling improvement in power at the cost of changing the spectrum of sensitivity to systematic errors in study cohorts. This has application in studies on rare diseases or in between-case analyses.
I then developed a method for partially characterising heterogeneity within a disease by modelling the bivariate distribution of case-control and within-case effect sizes. Using an adaptation of a likelihood-ratio test, this allows an assessment to be made of whether disease heterogeneity corresponds to differences in disease pathology. I applied this method to a range of simulated and real datasets, enabling insight into the cause of heterogeneity in autoantibody positivity in type 1 diabetes (T1D). Finally, I investigated the relation of subtypes of juvenile idiopathic arthritis (JIA) to adult diseases, using modified genetic risk scores and linear discriminants in a penalised regression framework.
The contribution of this thesis is in a range of methodological developments in the analysis of high-dimensional association study comparison. Methods such as these will have wide application in the analysis of GWAS and similar areas, particularly in the development of stratified medicine. I was supported by a grant from the NIHR Cambridge BR
Accurate error control in high‐dimensional association testing using conditional false discovery rates
Funder: Johnson and Johnson; Id: http://dx.doi.org/10.13039/100004331
Abstract: High‐dimensional hypothesis testing is ubiquitous in the biomedical sciences, and informative covariates may be employed to improve power. The conditional false discovery rate (cFDR) is a widely used approach suited to the setting where the covariate is a set of p‐values for the equivalent hypotheses for a second trait. Although related to the Benjamini–Hochberg procedure, it does not permit any easy control of type‐1 error rate and existing methods are over‐conservative. We propose a new method for type‐1 error rate control based on identifying mappings from the unit square to the unit interval defined by the estimated cFDR and splitting observations so that each map is independent of the observations it is used to test. We also propose an adjustment to the existing cFDR estimator which further improves power. We show by simulation that the new method more than doubles potential improvement in power over unconditional analyses compared to existing methods. We demonstrate our method on transcriptome‐wide association studies and show that the method can be used in an iterative way, enabling the use of multiple covariates successively. Our methods substantially improve the power and applicability of cFDR analysis.
Model updating after interventions paradoxically introduces bias
Machine learning is increasingly being used to generate prediction models for
use in a number of real-world settings, from credit risk assessment to clinical
decision support. Recent discussions have highlighted potential problems in the
updating of a predictive score for a binary outcome when an existing predictive
score forms part of the standard workflow, driving interventions. In this
setting, the existing score induces an additional causative pathway which leads
to miscalibration when the original score is replaced. We propose a general
causal framework to describe and address this problem, and demonstrate an
equivalent formulation as a partially observed Markov decision process. We use
this model to demonstrate the impact of such 'naive updating' when performed
repeatedly. Namely, we show that successive predictive scores may converge to a
point where they predict their own effect, or may eventually tend toward a
stable oscillation between two values, and we argue that neither outcome is
desirable. Furthermore, we demonstrate that even if model-fitting procedures
improve, actual performance may worsen. We complement these findings with a
discussion of several potential routes to overcome these issues. Comment: Sections of this preprint on 'Successive adjuvancy' (section 4,
theorem 2, figures 4,5, and associated discussions) were not included in the
originally submitted version of this paper due to length. This material does
not appear in the published version of this manuscript, and the reader should
be aware that these sections did not undergo peer review
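The two long-run behaviours named in this abstract can be reproduced in a deliberately simple toy model, sketched below. It is not the paper's causal framework or its Markov decision process formulation: the function `naive_updating`, the two intervention rules, and all numbers are hypothetical. Each round, the new score is refitted to the risk observed while interventions were driven by the previous score.

```python
def naive_updating(base_risk, risk_reduction, n_rounds):
    """Toy model of repeated naive updating for a single individual.

    base_risk: outcome probability with no intervention.
    risk_reduction: function mapping the current score to the fraction of
        risk removed by the intervention that score triggers (hypothetical).
    Returns the sequence of scores, starting with the pre-intervention fit.
    """
    score = base_risk  # the first score is fitted before any intervention exists
    history = [score]
    for _ in range(n_rounds):
        # The refitted score equals the risk observed under the old score's intervention
        score = base_risk * (1.0 - risk_reduction(score))
        history.append(score)
    return history


def graded(score):
    """Intervention strength proportional to the score (assumed rule)."""
    return 0.5 * score


def thresholded(score):
    """Intervene at full strength only above a cutoff (assumed rule)."""
    return 0.5 if score > 0.3 else 0.0


converging = naive_updating(0.4, graded, 50)        # settles at a self-predicting value
oscillating = naive_updating(0.4, thresholded, 4)   # alternates between two values
```

Under the graded rule the scores settle at the value that predicts its own effect, here 0.4 / (1 + 0.5 × 0.4) = 1/3; under the thresholded rule they alternate between 0.4 and 0.2 indefinitely.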
Effects of a novel, brief psychological therapy (Managing Unusual Sensory Experiences) for hallucinations in first episode psychosis (MUSE FEP): findings from an exploratory randomised controlled trial.
Hallucinations are a common feature of psychosis, yet access to effective psychological treatment is limited. The Managing Unusual Sensory Experiences for First-Episode-Psychosis (MUSE-FEP) trial aimed to establish the feasibility and acceptability of a brief, hallucination-specific, digitally provided treatment, delivered by a non-specialist workforce for people with psychosis. MUSE uses psychoeducation about the causal mechanisms of hallucinations and tailored interventions to help a person understand and manage their experiences. We undertook a two-site, single-blind (rater) Randomised Controlled Trial and recruited 82 participants who were allocated 1:1 to MUSE and treatment as usual (TAU) (n=40) or TAU alone (n=42). Participants completed assessments before and after treatment (2 months), and at follow up (3-4 months). Information on recruitment rates, adherence, and completion of outcome assessments was collected. Analyses focussed on feasibility outcomes and initial estimates of intervention effects to inform a future trial. The trial is registered with the ISRCTN registry 16793301. Criteria for the feasibility of trial methodology and intervention delivery were met. The trial exceeded the recruitment target, had high retention rates (87.8%) at end of treatment, and at follow up (86.6%), with good acceptability of treatment. There were 3 serious adverse events in the therapy group, and 5 in the TAU group. Improvements were evident in both groups at the end of treatment and follow up, with a particular benefit in perceived recovery in the MUSE group. We showed it was feasible to increase access to psychological intervention but a definitive trial requires further changes to the trial design or treatment
A method for identifying genetic heterogeneity within phenotypically defined disease subgroups.
Many common diseases show wide phenotypic variation. We present a statistical method for determining whether phenotypically defined subgroups of disease cases represent different genetic architectures, in which disease-associated variants have different effect sizes in two subgroups. Our method models the genome-wide distributions of genetic association statistics with mixture Gaussians. We apply a global test without requiring explicit identification of disease-associated variants, thus maximizing power in comparison to standard variant-by-variant subgroup analysis. Where evidence for genetic subgrouping is found, we present methods for post hoc identification of the contributing genetic variants. We demonstrate the method on a range of simulated and test data sets, for which expected results are already known. We investigate subgroups of individuals with type 1 diabetes (T1D) defined by autoantibody positivity, establishing evidence for differential genetic architecture with positivity for thyroid-peroxidase-specific antibody, driven generally by variants in known T1D-associated genomic regions. We acknowledge the help of the Diabetes and Inflammation Laboratory Data Service for access and quality control procedures on the data sets used in this study. The JDRF/Wellcome Trust Diabetes and Inflammation Laboratory is in receipt of a Wellcome Trust Strategic Award (107212; J.A.T.) and receives funding from the NIHR Cambridge Biomedical Research Centre. J.L. is funded by the NIHR Cambridge Biomedical Research Centre and is on the Wellcome Trust PhD program in Mathematical Genomics and Medicine at the University of Cambridge. C.W. is funded by the MRC (grant MC_UP_1302/5). We thank M. Simmonds, S. Gough, J. Franklyn, and O. Brand for sharing their AITD genetic association data set and all patients with AITD and control subjects for participating in this study. The AITD UK national collection was funded by the Wellcome Trust.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript
Large‐Amplitude Mountain Waves in the Mesosphere Observed on 21 June 2014 During DEEPWAVE: 1. Wave Development, Scales, Momentum Fluxes, and Environmental Sensitivity
A remarkable, large‐amplitude, mountain wave (MW) breaking event was observed on the night of 21 June 2014 by ground‐based optical instruments operated on the New Zealand South Island during the Deep Propagating Gravity Wave Experiment (DEEPWAVE). Concurrent measurements of the MW structures, amplitudes, and background environment were made using an Advanced Mesospheric Temperature Mapper, a Rayleigh Lidar, an All‐Sky Imager, and a Fabry‐Perot Interferometer. The MW event was observed primarily in the OH airglow emission layer at an altitude of ~82 km, over an ~2‐hr interval (~10:30–12:30 UT), during strong eastward winds at the OH altitude and above, which weakened with time. The MWs displayed dominant horizontal wavelengths ranging from ~40 to 70 km and temperature perturbation amplitudes as large as ~35 K. The waves were characterized by an unusual, “saw‐tooth” pattern in the larger‐scale temperature field exhibiting narrow cold phases separating much broader warm phases with increasing temperatures toward the east, indicative of strong overturning and instability development. Estimates of the momentum fluxes during this event revealed a distinct periodicity (~25 min) with three well‐defined peaks ranging from ~600 to 800 m²/s², among the largest ever inferred at these altitudes. These results suggest that MW forcing at small horizontal scales (km) can play large roles in the momentum budget of the mesopause region when forcing and propagation conditions allow them to reach mesospheric altitudes with large amplitudes. A detailed analysis of the instability dynamics accompanying this breaking MW event is presented in a companion paper, Fritts et al. (2019, https://doi.org/10.1029/2019jd030899).