214 research outputs found

    Sharing extended summary data from contemporary genetics studies is unlikely to threaten subject privacy

    Background Starting from a forensic problem, Homer et al. showed that it was possible to detect an individual contributing as little as 0.5% of the DNA in a pool. The finding was extended to prove the possibility of detecting whether a subject participated in a small homogeneous GWAS. We denote this as the detection of a subject belonging to a certain cohort (SBCC). Subsequently, Visscher and Hill showed that the power to detect an SBCC signal for an ethnically homogeneous cohort depends roughly on the ratio of the number of independent markers to the total sample size. However, it is not clear if the same holds for more ethnically diverse cohorts. Later, Masca et al. proposed, as an SBCC test, a regression of the departure from assumed population frequency of i) the subject genotype on that of ii) the frequency in the cohort of interest. They used simulations to show that the approach has better SBCC detection power than the original Homer method but is impeded by population stratification. Approach To investigate the possibility of SBCC detection in multi-ethnic cohorts, we generalize the Masca et al. approach by theoretically deriving the correlation between a subject genotype and the cohort reference allele frequencies (RAFs) for stratified cohorts. Based on the derived formula, we theoretically show that, due to background stratification noise, SBCC detection is unlikely even for mildly stratified cohorts of size greater than around a thousand subjects. Thus, for the vast majority of contemporary cohorts, the fear of compromising privacy via SBCC detection is unfounded.

    Association Testing Strategy for Data from Dense Marker Panels

    Genome-wide association studies have usually been analyzed in a univariate manner. The commonly used univariate tests have one degree of freedom and assume an additive mode of inheritance. The experiment-wise significance of these univariate statistics is obtained by adjusting for multiple testing. Next-generation sequencing studies, which assay 10-20 million variants, are beginning to come online. For these studies, the strategy of additive univariate testing and multiple testing adjustment is likely to result in a loss of power due to (1) the substantial multiple testing burden and (2) the possibility of a non-additive causal mode of inheritance. To reduce the power loss, we propose a new method (1) to summarize in a single statistic the strength of the association signals coming from all not-very-rare variants in a linkage disequilibrium block and (2) to incorporate, in any linkage disequilibrium block statistic, the strength of the association signals under multiple modes of inheritance. The proposed linkage disequilibrium block test consists of the sum of squares of nominally significant univariate statistics. We compare the performance of this method to the performance of existing linkage disequilibrium block/gene-based methods. Simulations show that (1) extending methods to combine testing for multiple modes of inheritance leads to substantial power gains, especially for a recessive mode of inheritance, and (2) the proposed method has a good overall performance. Based on simulation results, we provide practical advice on choosing suitable methods for applied analyses.
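
    The block statistic described in the abstract above (a sum of squares of nominally significant univariate statistics) can be illustrated with a minimal sketch. The function name, the use of z-scores as the univariate statistics, and the two-sided 0.05 cutoff are assumptions made for illustration only; the abstract does not specify them, and this sketch does not compute a p-value for the block statistic, which would need to account for linkage disequilibrium among markers.

```python
from statistics import NormalDist


def block_statistic(z_scores, alpha=0.05):
    """Sum the squared z-scores of markers that are nominally significant.

    z_scores: univariate association z-scores for the markers in one
              linkage disequilibrium block (hypothetical input format).
    alpha:    two-sided nominal significance level (assumed, not from the source).
    """
    # Two-sided critical value, about 1.96 for alpha = 0.05.
    threshold = NormalDist().inv_cdf(1 - alpha / 2)
    # Keep only markers whose |z| reaches the nominal threshold.
    return sum(z * z for z in z_scores if abs(z) >= threshold)


if __name__ == "__main__":
    # Illustrative values only; two of the four markers pass the cutoff.
    print(block_statistic([0.5, 2.0, -3.0, 1.0]))
```

    Markers below the nominal cutoff contribute nothing, so a block with no nominally significant marker gets a statistic of zero.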

    RNA-Seq analysis implicates dysregulation of the immune system in schizophrenia

    Background While genome-wide association studies identified some promising candidates for schizophrenia, the majority of risk genes remained unknown. We were interested in testing whether integrating gene expression and other functional information could facilitate the identification of susceptibility genes and related biological pathways. Results We conducted high-throughput sequencing analyses to evaluate mRNA expression in blood samples isolated from 3 schizophrenia patients and 3 healthy controls. We also conducted pooled sequencing of 10 schizophrenic patients and matched controls. Differentially expressed genes were identified by t-test. In the individually sequenced dataset, we identified 198 genes differentially expressed between cases and controls, of which 19 were verified by the pooled sequencing dataset and 21 reached nominal significance in gene-based association analyses of a genome-wide association dataset. Pathway analysis of these differentially expressed genes revealed that they were highly enriched in immune-related pathways. Two genes, S100A8 and TYROBP, had consistent changes in expression in both individual and pooled sequencing datasets and were nominally significant in gene-based association analysis. Conclusions Integration of gene expression and pathway analyses with genome-wide association may be an efficient approach to identify risk genes for schizophrenia.

    DISTMIX: direct imputation of summary statistics for unmeasured SNPs from mixed ethnicity cohorts

    Motivation: To increase the signal resolution for large-scale meta-analyses of genome-wide association studies, genotypes at unmeasured single nucleotide polymorphisms (SNPs) are commonly imputed using large multi-ethnic reference panels. However, the ever-increasing size and ethnic diversity of both reference panels and cohorts make genotype imputation computationally challenging for moderately sized computer clusters. Moreover, genotype imputation requires subject-level genetic data, which, unlike the summary statistics provided by virtually all studies, is not publicly available. While there are much less demanding methods that avoid the genotype imputation step by directly imputing SNP statistics, e.g. Directly Imputing summary STatistics (DIST) proposed by our group, their implicit assumptions make them applicable only to ethnically homogeneous cohorts. Results: To decrease computational and access requirements for the analysis of cosmopolitan cohorts, we propose DISTMIX, which extends DIST capabilities to the analysis of mixed ethnicity cohorts. The method uses a relevant reference panel to directly impute unmeasured SNP statistics based only on statistics at measured SNPs and estimated/user-specified ethnic proportions. Simulations show that the proposed method adequately controls the Type I error rates. The 1000 Genomes panel imputation of summary statistics from the ethnically diverse Psychiatric Genetic Consortium Schizophrenia Phase 2 suggests that, when compared to genotype imputation methods, DISTMIX offers comparable imputation accuracy for only a fraction of the computational resources.

    JEPEGMIX2: improved gene-level joint analysis of eQTLs in cosmopolitan cohorts.

    Motivation: To increase detection power, researchers use gene-level analysis methods to aggregate weak marker signals. Because gene expression controls biological processes, researchers proposed aggregating signals for expression Quantitative Trait Loci (eQTL). Most gene-level eQTL methods make statistical inferences based on i) summary statistics from genome-wide association studies (GWAS) and ii) linkage disequilibrium (LD) patterns from a relevant reference panel. While most such tools assume homogeneous cohorts, our Gene-level Joint Analysis of functional SNPs in Cosmopolitan Cohorts (JEPEGMIX) method accommodates cosmopolitan cohorts by using heterogeneous panels. However, JEPEGMIX relies on brain eQTLs from older gene expression studies and does not adjust for background enrichment in GWAS signals. Results: We propose JEPEGMIX2, an extension of JEPEGMIX. When compared to JEPEGMIX, it uses i) cis-eQTL SNPs from the latest expression studies and ii) brain-specific (sub)tissues and tissues other than brain. JEPEGMIX2 also i) avoids accumulating averagely enriched polygenic information by adjusting for background enrichment and ii), to avoid an increase in false positive rates for studies with numerous highly enriched (above the background) genes, outputs gene q-values based on Holm adjustment of p-values. Supplementary information: Supplementary material is available at Bioinformatics online. Bioinformatics 2018; 34(2):286-28
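
    The Holm adjustment of p-values mentioned in the abstract above is a standard step-down procedure, sketched below. This is a generic illustration and not the JEPEGMIX2 implementation; the function name and the example p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order.

    The k-th smallest p-value (k starting at 1) is multiplied by
    (m - k + 1), capped at 1, and made monotone non-decreasing
    along the sorted order.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never falls below
        # the adjusted value of a smaller raw p-value.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Hypothetical gene-level p-values, for illustration only.
    print(holm_adjust([0.01, 0.04, 0.03]))
```

    The adjusted values are returned in the input order, so they can be paired with gene names directly.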

    Linkage analysis of anorexia and bulimia nervosa cohorts using selected behavioral phenotypes as quantitative traits or covariates

    To increase the likelihood of finding genetic variation conferring liability to eating disorders, we measured over 100 attributes thought to be related to liability to eating disorders on affected individuals from multiplex families and two cohorts: one recruited through a proband with anorexia nervosa (AN; AN cohort); the other recruited through a proband with bulimia nervosa (BN; BN cohort). By a multilayer decision process based on expert evaluation and statistical analysis, six traits were selected for linkage analysis (1): obsessionality (OBS), age at menarche (MENAR) and anxiety (ANX) for quantitative trait locus (QTL) linkage analysis; and lifetime minimum Body Mass Index (BMI), concern over mistakes (CM) and food-related obsessions (OBF) for covariate-based linkage analysis. The BN cohort produced the largest linkage signals: for QTL linkage analysis, four suggestive signals (for MENAR, at 10p13; for ANX, at 1q31.1, 4q35.2, and 8q13.1); for covariate-based linkage analyses, both significant and suggestive linkages (for BMI, one significant [4q21.1] and three suggestive [3p23, 10p13, 5p15.3]; for CM, two significant [16p13.3, 14q21.1] and three suggestive [4p15.33, 8q11.23, 10p11.21]; and for OBF, one significant [14q21.1] and five suggestive [4p16.1, 10p13.1, 8q11.23, 16p13.3, 18p11.31]). Results from the AN cohort were far less compelling: for QTL linkage analysis, two suggestive signals (for OBS at 6q21 and for ANX at 9p21.3); for covariate-based linkage analysis, five suggestive signals (for BMI at 4q13.1, for CM at 11p11.2 and 17q25.1, and for OBF at 17q25.1 and 15q26.2). Overlap between the two cohorts was minimal for substantial linkage signals.

    Genetic Relationship between Schizophrenia and Nicotine Dependence

    It is well known that most schizophrenia patients smoke cigarettes. There are different hypotheses postulating the underlying mechanisms of this comorbidity. We used summary statistics from large meta-analyses of plasma cotinine concentration (COT), Fagerström test for nicotine dependence (FTND) and schizophrenia to examine the genetic relationship between these traits. We found that schizophrenia risk scores calculated at P-value thresholds of 5 × 10^-3 and larger predicted FTND and cigarettes smoked per day (CPD), suggesting that genes most significantly associated with schizophrenia were not associated with FTND/CPD, consistent with the self-medication hypothesis. The COT risk scores predicted schizophrenia diagnosis at P-values of 5 × 10^-3 and smaller, implying that genes most significantly associated with COT were associated with schizophrenia. These results implied that schizophrenia and FTND/CPD/COT shared some genetic liability. Based on this shared liability, we identified multiple long non-coding RNAs and RNA binding protein genes (DA376252, BX089737, LOC101927273, LINC01029, LOC101928622, HY157071, DA902558, RBFOX1 and TINCR), protein modification genes (MANBA, UBE2D3, and RANGAP1) and energy production genes (XYLB, MTRF1 and ENOX1) that were associated with both conditions. Further analyses revealed that these shared genes were enriched in calcium signaling, long-term potentiation and neuroactive ligand-receptor interaction pathways that played a critical role in cognitive functions and neuronal plasticity.

    Genome-Wide Gene-Environment Study Identifies Glutamate Receptor Gene GRIN2A as a Parkinson's Disease Modifier Gene via Interaction with Coffee

    Our aim was to identify genes that influence the inverse association of coffee with the risk of developing Parkinson's disease (PD). We used genome-wide genotype data and lifetime caffeinated-coffee-consumption data on 1,458 persons with PD and 931 without PD from the NeuroGenetics Research Consortium (NGRC), and we performed a genome-wide association and interaction study (GWAIS), testing each SNP's main-effect plus its interaction with coffee, adjusting for sex, age, and two principal components. We then stratified subjects as heavy or light coffee-drinkers and performed genome-wide association study (GWAS) in each group. We replicated the most significant SNP. Finally, we imputed the NGRC dataset, increasing genomic coverage to examine the region of interest in detail. The primary analyses (GWAIS, GWAS, Replication) were performed using genotyped data. In GWAIS, the most significant signal came from rs4998386 and the neighboring SNPs in GRIN2A. GRIN2A encodes an NMDA-glutamate-receptor subunit and regulates excitatory neurotransmission in the brain. Achieving P_2df = 10^-6, GRIN2A surpassed all known PD susceptibility genes in significance in the GWAIS. In stratified GWAS, the GRIN2A signal was present in heavy coffee-drinkers (OR = 0.43; P = 6×10^-7) but not in light coffee-drinkers. The a priori Replication hypothesis that “Among heavy coffee-drinkers, rs4998386_T carriers have lower PD risk than rs4998386_CC carriers” was confirmed: OR_Replication = 0.59, P_Replication = 10^-3; OR_Pooled = 0.51, P_Pooled = 7×10^-8. Compared to light coffee-drinkers with rs4998386_CC genotype, heavy coffee-drinkers with rs4998386_CC genotype had 18% lower risk (P = 3×10^-3), whereas heavy coffee-drinkers with rs4998386_TC genotype had 59% lower risk (P = 6×10^-13). Imputation revealed a block of SNPs that achieved P_2df < 5×10^-8 in GWAIS, and OR = 0.41, P = 3×10^-8 in heavy coffee-drinkers.
    This study is proof of concept that inclusion of environmental factors can help identify genes that are missed in GWAS. Both adenosine antagonists (caffeine-like) and glutamate antagonists (GRIN2A-related) are being tested in clinical trials for treatment of PD. GRIN2A may be a useful pharmacogenetic marker for subdividing individuals in clinical trials to determine which medications might work best for which patients.

    A large-scale genome-wide association study meta-analysis of cannabis use disorder

    Summary Background Variation in liability to cannabis use disorder has a strong genetic component (estimated twin and family heritability about 50–70%) and is associated with negative outcomes, including increased risk of psychopathology. The aim of the study was to conduct a large genome-wide association study (GWAS) to identify novel genetic variants associated with cannabis use disorder. Methods To conduct this GWAS meta-analysis of cannabis use disorder and identify associations with genetic loci, we used samples from the Psychiatric Genomics Consortium Substance Use Disorders working group, iPSYCH, and deCODE (20 916 case samples, 363 116 control samples in total), contrasting cannabis use disorder cases with controls. To examine the genetic overlap between cannabis use disorder and 22 traits of interest (chosen because of previously published phenotypic correlations [eg, psychiatric disorders] or hypothesised associations [eg, chronotype] with cannabis use disorder), we used linkage disequilibrium score regression to calculate genetic correlations. Findings We identified two genome-wide significant loci: a novel chromosome 7 locus (FOXP2, lead single-nucleotide polymorphism [SNP] rs7783012; odds ratio [OR] 1·11, 95% CI 1·07–1·15, p=1·84 × 10^-9) and the previously identified chromosome 8 locus (near CHRNA2 and EPHX2, lead SNP rs4732724; OR 0·89, 95% CI 0·86–0·93, p=6·46 × 10^-9). Cannabis use disorder and cannabis use were genetically correlated (rg 0·50, p=1·50 × 10^-21), but they showed significantly different genetic correlations with 12 of the 22 traits we tested, suggesting at least partially different genetic underpinnings of cannabis use and cannabis use disorder. Cannabis use disorder was positively genetically correlated with other psychopathology, including ADHD, major depression, and schizophrenia.
    Interpretation These findings support the theory that cannabis use disorder has shared genetic liability with other psychopathology, and there is a distinction between genetic liability to cannabis use and cannabis use disorder. Funding National Institute of Mental Health; National Institute on Alcohol Abuse and Alcoholism; National Institute on Drug Abuse; Center for Genomics and Personalized Medicine and the Centre for Integrative Sequencing; The European Commission, Horizon 2020; National Institute of Child Health and Human Development; Health Research Council of New Zealand; National Institute on Aging; Wellcome Trust Case Control Consortium; UK Research and Innovation Medical Research Council (UKRI MRC); The Brain & Behavior Research Foundation; National Institute on Deafness and Other Communication Disorders; Substance Abuse and Mental Health Services Administration (SAMHSA); National Institute of Biomedical Imaging and Bioengineering; National Health and Medical Research Council (NHMRC) Australia; Tobacco-Related Disease Research Program of the University of California; Families for Borderline Personality Disorder Research (Beth and Rob Elliott) 2018 NARSAD Young Investigator Grant; The National Child Health Research Foundation (Cure Kids); The Canterbury Medical Research Foundation; The New Zealand Lottery Grants Board; The University of Otago; The Carney Centre for Pharmacogenomics; The James Hume Bequest Fund; National Institutes of Health: Genes, Environment and Health Initiative; National Institutes of Health; National Cancer Institute; The William T Grant Foundation; Australian Research Council; The Virginia Tobacco Settlement Foundation; The VISN 1 and VISN 4 Mental Illness Research, Education, and Clinical Centers of the US Department of Veterans Affairs; The 5th Framework Programme (FP-5) GenomEUtwin Project; The Lundbeck Foundation; NIH-funded Shared Instrumentation Grant S10RR025141; Clinical Translational Sciences Award grants; National Institute of Neurological Disorders and Stroke; National Heart, Lung, and Blood Institute; National Institute of General Medical Sciences. Peer reviewed.