The effect of rare variants on inflation of the test statistics in case-control analyses.

Author: Lush Michael
Pharoah Paul DP
Pirie Ailith
Tyrer Jonathan
Wood Angela
Publication venue: BMC Bioinformatics
Publication date: 01/01/2015

BACKGROUND: The detection of bias due to cryptic population structure is an important step in the evaluation of findings of genetic association studies. The standard method of measuring this bias in a genetic association study is to compare the observed median association test statistic to the expected median test statistic. This ratio is inflated in the presence of cryptic population structure. However, inflation may also be caused by the properties of the association test itself, particularly in the analysis of rare variants. We compared the properties of the three most commonly used association tests: the likelihood ratio test, the Wald test and the score test when testing rare variants for association using simulated data. RESULTS: We found evidence of inflation in the median test statistics of the likelihood ratio and score tests for tests of variants with fewer than 20 heterozygotes across the sample, regardless of the total sample size. The test statistics for the Wald test were under-inflated at the median for variants below the same minor allele frequency. CONCLUSIONS: In a genetic association study, if a substantial proportion of the genetic variants tested have rare minor allele frequencies, the properties of the association test may mask the presence or absence of bias due to population structure. The use of either the likelihood ratio test or the score test is likely to lead to inflation in the median test statistic in the absence of population structure. In contrast, the use of the Wald test is likely to result in under-inflation of the median test statistic, which may mask the presence of population structure. This work was supported by a grant from Cancer Research UK (C490/A16561). AP is funded by a Medical Research Council studentship. This is the final published version. It first appeared at http://dx.doi.org/10.1186%2Fs12859-015-0496-1
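The ratio described in the abstract above (observed median test statistic over expected median) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each association test statistic follows a chi-square distribution with 1 degree of freedom under the null, which the abstract does not state, and the function name and example values are hypothetical.

```python
from statistics import median

# Assumption: test statistics are chi-square with 1 degree of freedom under
# the null; the median of that distribution is approximately 0.4549.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(test_statistics):
    """Return the observed median test statistic divided by the expected median."""
    if not test_statistics:
        raise ValueError("need at least one test statistic")
    return median(test_statistics) / CHI2_1DF_MEDIAN


# Hypothetical test statistics, for illustration only (not data from the paper).
example_stats = [0.02, 0.10, 0.45, 0.50, 1.30, 2.70, 3.90]
lam = inflation_factor(example_stats)
print(f"inflation factor = {lam:.3f}")
```

A value above 1 indicates inflation at the median and a value below 1 indicates under-inflation. As the abstract notes, either can arise from the test itself when many variants are rare, so the ratio alone does not establish the presence or absence of population structure.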

Crossref

Springer - Publisher Connector

PubMed Central

Apollo (Cambridge)

Refining the accuracy of validated target identification through coding variant fine-mapping in type 2 diabetes.

Author: Afaq Saima
Afzal Shoaib
Ahlqvist Emma
Almgren Peter
Amin Najaf
An Ping
Bang Lia B
Bertoni Alain G
Bielak Lawrence F
Bombieri Cristina
Bork-Jensen Jette
Brandslund Ivan
Brody Jennifer A
Burtt Noël P
Canouil Mickaël
Chen Yii-Der Ida
Cho Yoon Shin
Christensen Cramer
Chu Audrey Y
Cook James P
de Haan Hugoline G
Demirkan Ayse
Eastwood Sophie V
Eckardt Kai-Uwe
ExomeBP Consortium
Fischer Krista
Flannick Jason
Gambaro Giovanni
Gan Wei
GIANT Consortium
Giedraitis Vilmantas
Graff Marielisa
Grarup Niels
Grove Megan L
Guo Xiuqing
Gustafsson Stefan
Hackinger Sophie
Hai Yang
Han Sohee
Highland Heather M
Hivert Marie-France
Hu Yao
Huo Shaofeng
Isomaa Bo
Jensen Richard A
Justice Anne E
Jäger Susanne
Jørgensen Marit E
Jørgensen Torben
Kim Bong-Jo
Kim Sung Soo
Kim Young Jin
Kitajima Hidetoshi
Koistinen Heikki A
Kovacs Peter
Kravic Jasmina
Kriebel Jennifer
Kronenberg Florian
Käräjämäki Annemari
Lange Leslie A
Lecoeur Cécile
Lee Jung-Jin
Lehne Benjamin
Li Huaixing
Li Jin
Li Man
Li-Gao Ruifang
Ligthart Symen
Lin Keng-Hung
Liu Dajiang J
Lohman Kurt K
Lu Yingchang
Läll Kristi
MAGIC Consortium
Mahajan Anubha
Malerba Giovanni
Marouli Eirini
Marten Jonathan
Meidtner Karina
Müller-Nurasyid Martina
Peloso Gina Marie
Preuss Michael
Prins Bram Peter
Rayner N William
Robertson Neil R
Rybin Denis V
Smith Albert Vernon
Steinthorsdottir Valgerdur
Tajes Juan Fernandez
Taliun Daniel
Trubetskoy Vassily Vladimirovich
Tybjærg-Hansen Anne
Varga Tibor V
Warren Helen R
Wessel Jennifer
Willems Sara M
Wuttke Matthias
Yaghootkar Hanieh
Zhang Weihua
Zhao Wei
Publication venue: eScholarship, University of California
Publication date: 01/04/2018

We aggregated coding variant data for 81,412 type 2 diabetes cases and 370,832 controls of diverse ancestry, identifying 40 coding variant association signals (P < 2.2 × 10⁻⁷); of these, 16 map outside known risk-associated loci. We make two important observations. First, only five of these signals are driven by low-frequency variants: even for these, effect sizes are modest (odds ratio ≤1.29). Second, when we used large-scale genome-wide association data to fine-map the associated variants in their regional context, accounting for the global enrichment of complex trait associations in coding sequence, compelling evidence for coding variant causality was obtained for only 16 signals. At 13 others, the associated coding variants clearly represent 'false leads' with potential to generate erroneous mechanistic inference. Coding variant associations offer a direct route to biological insight for complex diseases and identification of validated therapeutic targets; however, appropriate mechanistic inference requires careful specification of their causal contribution to disease predisposition.

eScholarship - University of California

Accurate Liability Estimation Improves Power in Ascertained Case Control Studies

Author: AL Price
C Lippert
C Widmer
Christoph Lippert
D Golan
D Welter
Dan Geiger
David Heckerman
DJ Balding
ER Dempster
J Listgarten
J Yang
LA Hindorff
LC Tsoi
M Fakiola
N Fusi
N Patterson
N Zaitlen
Omer Weissbrod
S Sawcer
S Wright
SH Lee
X Zhou
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/04/2015

Linear mixed models (LMMs) have emerged as the method of choice for confounded genome-wide association studies. However, the performance of LMMs in non-randomly ascertained case-control studies deteriorates with increasing sample size. We propose a framework called LEAP (Liability Estimator As a Phenotype, https://github.com/omerwe/LEAP) that tests for association with estimated latent values corresponding to severity of phenotype, and demonstrate that this can lead to a substantial power increase.

arXiv.org e-Print Archive

Crossref

MDC Repository

Using GWAS Data to Identify Copy Number Variants Contributing to Common Complex Diseases

Author: Teslovich Tanya M.
Zöllner Sebastian
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 25/10/2010

Copy number variants (CNVs) account for more polymorphic base pairs in the human genome than do single nucleotide polymorphisms (SNPs). CNVs encompass genes as well as noncoding DNA, making these polymorphisms good candidates for functional variation. Consequently, most modern genome-wide association studies test CNVs along with SNPs, after inferring copy number status from the data generated by high-throughput genotyping platforms. Here we give an overview of CNV genomics in humans, highlighting patterns that inform methods for identifying CNVs. We describe how genotyping signals are used to identify CNVs and provide an overview of existing statistical models and methods used to infer location and carrier status from such data, especially the most commonly used methods exploring hybridization intensity. We compare the power of such methods with the alternative method of using tag SNPs to identify CNV carriers. As such methods are only powerful when applied to common CNVs, we describe two alternative approaches that can be informative for identifying rare CNVs contributing to disease risk. We focus particularly on methods identifying de novo CNVs and show that such methods can be more powerful than case-control designs. Finally, we present some recommendations for identifying CNVs contributing to common complex disorders. Comment: Published in at http://dx.doi.org/10.1214/09-STS304 the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref

Fast and accurate imputation of summary statistics enhances evidence of functional enrichment

Author: Bhatia Gaurav
Gusev Alexander
Hirschhorn Joel
Pasaniuc Bogdan
Patterson Nick
Pickrell Joseph
Price Alkes L.
Shi Huwenbo
Strachan David P
Zaitlen Noah
Publication venue: 'Oxford University Press (OUP)'
Publication date: 12/09/2013

Imputation using external reference panels is a widely used approach for increasing power in GWAS and meta-analysis. Existing HMM-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available. In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1-5%) variants (increasing to 87% (60%) when summary LD information is available from target samples) versus 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and is computationally very fast. As an empirical demonstration, we apply our method to 7 case-control phenotypes from the WTCCC data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of $\chi^2$ association statistics) compared to HMM-based imputation from individual-level genotypes at the 227 (176) published SNPs in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of 4 lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic vs. non-genic loci for these traits, as compared to an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses. Comment: 32 pages, 4 figure

arXiv.org e-Print Archive

Crossref

PubMed Central

eScholarship - University of California

Population Structure and Cryptic Relatedness in Genetic Association Studies

Author: Astle William
Balding David J.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2009

We review the problem of confounding in genetic association studies, which arises principally because of population structure and cryptic relatedness. Many treatments of the problem consider only a simple "island" model of population structure. We take a broader approach, which views population structure and cryptic relatedness as different aspects of a single confounder: the unobserved pedigree defining the (often distant) relationships among the study subjects. Kinship is therefore a central concept, and we review methods of defining and estimating kinship coefficients, both pedigree-based and marker-based. In this unified framework we review solutions to the problem of population structure, including family-based study designs, genomic control, structured association, regression control, principal components adjustment and linear mixed models. The last solution makes the most explicit use of the kinships among the study subjects, and has an established role in the analysis of animal and plant breeding studies. Recent computational developments mean that analyses of human genetic association data are beginning to benefit from its powerful tests for association, which protect against population structure and cryptic kinship, as well as intermediate levels of confounding by the pedigree. Comment: Published in at http://dx.doi.org/10.1214/09-STS307 the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

CiteSeerX

Crossref

OpenGrey Repository

University of Melbourne Institutional Repository

Accounting for Population Structure in Gene-by-Environment Interactions in Genome-Wide Association Studies Using Mixed Models.

Author: Bilow Michael
Eskin Eleazar
Furlotte Nick
He Dan
Kostem Emrah
Sul Jae Hoon
Yang Wen-Yun
Publication venue: eScholarship, University of California
Publication date: 01/03/2016

Although genome-wide association studies (GWASs) have discovered numerous novel genetic variants associated with many complex traits and diseases, those genetic variants typically explain only a small fraction of phenotypic variance. Factors that account for phenotypic variance include environmental factors and gene-by-environment interactions (GEIs). Recently, several studies have conducted genome-wide gene-by-environment association analyses and demonstrated important roles of GEIs in complex traits. One of the main challenges in these association studies is to control effects of population structure that may cause spurious associations. Many studies have analyzed how population structure influences statistics of genetic variants and developed several statistical approaches to correct for population structure. However, the impact of population structure on GEI statistics in GWASs has not been extensively studied, nor have methods been designed to correct for population structure on GEI statistics. In this paper, we show both analytically and empirically that population structure may cause spurious GEIs and use both simulation and two GWAS datasets to support our finding. We propose a statistical approach based on mixed models to account for population structure on GEI statistics. We find that our approach effectively controls population structure on statistics for GEIs as well as for genetic variants.

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

FigShare

Fifteen new risk loci for coronary artery disease highlight arterial-wall-specific mechanisms

Author: A Schröder
Abdulla al Shafi Majumder
Adam S Butterworth
Alex P Reiner
Anders Malarstig
Andrew D Johnson
Anne Justice
Anne Tybjærg-Hansen
AP Levy
AP Morris
AP Reiner
Arshed A Quyyumi
Asif Rasheed
AV Segrè
Benjamin B Sun
BF Voight
BL Harry
BP Fairfax
Børge G Nordestgaard
C Moore
Cara L Carty
Carl J Pepine
Chao A Hsiung
Charles Kooperberg
CJ Willer
D Qu
Daniel F Freitag
Daniel J Rader
Daniel R Barnes
Danish Saleheen
Devin Absher
Dewan S Alam
Dirk S Paul
DM Greenawalt
E Grundberg
EE Schadt
Elias L Salfati
Emanuele Di Angelantonio
Eric B Fauman
Eric Boerwinkle
F Innocenti
GA Roth
GR Abecasis
H Kirsten
H Lin
H Schunkert
Heribert Schunkert
HJ Westra
Hugh Watkins
I Holme
I-Te Lee
J Chen
J Dennis
J Erdmann
J Ernst
J Yang
Jeanette Erdmann
Jemma B Wilk
Jerome I Rotter
Joanna M M Howson
John A Spertus
John D Eicher
John Danesh
JR Privratsky
JR Staley
Julie A Johnson
Jyh-Ming J Juang
Kari E North
Katrine L Rasmussen
Kent D Taylor
Kristin Young
LD Ward
Lindsay L Waite
LM Boettger
Lucia A Hindorff
M Arnold
M Narahara
M Uhlen
M Uhlén
Mariaelisa Graff
N Franceschini
Nilesh J Samani
NJ Samani
NL Smith
Nora Franceschini
O Franzén
P Surendran
P Zanoni
Panos Deloukas
Philippe Frossard
Pia R Kamstrup
Praveen Surendran
R Goel
Rajiv Chowdhury
Ren-Hua Chung
Robin Young
Ron Do
S Purcell
Sekar Kathiresan
Stanley L Hazen
Steven Buyske
Sune F Nielsen
T Lappalainen
T Zeller
Themistocles L Assimes
Thomas Quertermous
TL Assimes
TM Teslovich
Tzung-Dau Wang
Ulrike Peters
V Nanda
W Tang
Wayne H H Sheu
Weang-Kee Ho
Wei Zhao
Wei-Yu Lin
Wen-Jane Lee
WJ Astle
X Zhang
X Zhou
Xiuqing Guo
Yi-Jen Hung
Yii-Der Ida Chen
Ying-Hsiang Chen
Z Wang
Å Johansson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Coronary artery disease (CAD) is a leading cause of morbidity and mortality worldwide. Although 58 genomic regions have been associated with CAD thus far, most of the heritability is unexplained, indicating that additional susceptibility loci await identification. An efficient discovery strategy may be larger-scale evaluation of promising associations suggested by genome-wide association studies (GWAS). Hence, we genotyped 56,309 participants using a targeted gene array derived from earlier GWAS results and performed meta-analysis of results with 194,427 participants previously genotyped, totaling 88,192 CAD cases and 162,544 controls. We identified 25 new SNP-CAD associations (P < 5 × 10⁻⁸, in fixed-effects meta-analysis) from 15 genomic regions, including SNPs in or near genes involved in cellular adhesion, leukocyte migration and atherosclerosis (PECAM1, rs1867624), coagulation and inflammation (PROCR, rs867186 (p.Ser219Gly)) and vascular smooth muscle cell differentiation (LMOD1, rs2820315). Correlation of these regions with cell-type-specific gene expression and plasma protein levels sheds light on potential disease mechanisms.
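For readers unfamiliar with fixed-effects meta-analysis, the sketch below shows one common form, inverse-variance weighting of per-study effect estimates. The abstract does not say which weighting scheme or software was used, so this is an assumption for illustration; the function name and the input values are hypothetical.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Combine per-study effects with inverse-variance weights (an assumed scheme).

    Returns the pooled effect, its standard error, the z-score and a
    two-sided p-value from the normal approximation.
    """
    if not betas or len(betas) != len(standard_errors):
        raise ValueError("betas and standard_errors must be non-empty and equal length")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


# Hypothetical log odds ratios and standard errors from two studies.
beta, se, z, p = fixed_effects_meta([0.10, 0.20], [0.05, 0.05])
print(f"beta={beta:.3f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

With equal standard errors the pooled effect is the plain average of the study effects. A pooled p-value would then be compared against a genome-wide threshold such as the P < 5 × 10⁻⁸ quoted in the abstract.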

Crossref

Copenhagen University Research Information System

Carolina Digital Repository

eScholarship - University of California

Enlighten

Meta-analysis of exome array data identifies six novel genetic loci for lung function [version 3; referees: 2 approved]

Author: Kraja Aldi
Province Michael A
et al
Publication venue: Digital Commons@Becker
Publication date: 01/01/2018

Digital Commons@Becker

Gene-based genome-wide association studies and meta-analyses of conotruncal heart defects.

Author: Agopian AJ
Goldmuntz Elizabeth
Hakonarson Hakon
Mitchell Laura E
Morrow Bernice E
Pediatric Cardiac Genomics Consortium
Sewda Anshuman
Taylor Deanne
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Conotruncal heart defects (CTDs) are among the most common and severe groups of congenital heart defects. Despite evidence of an inherited genetic contribution to CTDs, little is known about the specific genes that contribute to the development of CTDs. We performed gene-based genome-wide analyses using microarray-genotyped and imputed common and rare variant data from two large studies of CTDs in the United States. We performed two case-parent trio analyses (N = 640 and 317 trios), using an extension of the family-based multi-marker association test, and two case-control analyses (N = 482 and 406 patients and comparable numbers of controls), using a sequence kernel association test. We also undertook two meta-analyses to combine the results from the analyses that used the same approach (i.e. family-based or case-control). To our knowledge, these analyses are the first reported gene-based, genome-wide association studies of CTDs. Based on our findings, we propose eight CTD candidate genes (ARF5, EIF4E, KPNA1, MAP4K3, MBNL1, NCAPG, NDUFS1 and PSMG3). Four of these genes (ARF5, KPNA1, NDUFS1 and PSMG3) have not been previously associated with normal or abnormal heart development. In addition, our analyses provide additional evidence that genes involved in chromatin modification and in ribonucleic acid splicing are associated with congenital heart defects.

Directory of Open Access Journals

eScholarship - University of California