Harvard University - DASH

eScholarship - University of California

UCL Discovery

University of Miami: Scholarship Miami

George Washington University: Health Sciences Research Commons (HSRC)

The Francis Crick Institute

Local Genealogies in a Linear Mixed Model for Genome-Wide Association Mapping in Complex Pedigreed Populations

Author: Bernt Guldbrandtsen
Goutam Sahana
J Akey
JM Yu
JPA Ioannidis
L Crooks
Mogens Sandø Lund
P Scheet
PIW de Bakker
S Besenbacher
T Mailund
Thomas Mailund
Y Liu
ZH Ding
Zhongming Zhao
Publication venue: Public Library of Science
Publication date: 02/11/2011

INTRODUCTION: The state of the art for dealing with multiple levels of relationship among samples in genome-wide association studies (GWAS) is unified mixed model analysis (MMA). This approach is very flexible, can be applied to both family-based and population-based samples, and can be extended to incorporate other effects in a straightforward and rigorous fashion. Here, we present a complementary approach, called GENMIX (genealogy-based mixed model), which combines advantages from two powerful GWAS methods: genealogy-based haplotype grouping and MMA. SUBJECTS AND METHODS: We validated GENMIX using genotype data from Danish Jersey cattle with simulated phenotypes and compared it with MMA. We simulated scenarios for three levels of heritability (0.21, 0.34, and 0.64), seven levels of minor allele frequency (MAF; 0.05, 0.10, 0.15, 0.20, 0.25, 0.35, and 0.45), and five levels of QTL effect (0.1, 0.2, 0.5, 0.7, and 1.0 phenotypic standard deviation units). Each of the 105 possible combinations (3 h² × 7 MAF × 5 effects) was replicated 25 times. RESULTS: GENMIX provides a better ranking of markers close to the causative locus. GENMIX outperformed MMA when the QTL effect was small and the MAF at the QTL was low. In scenarios where the MAF was high or the QTL had a large effect, GENMIX and MMA performed similarly. CONCLUSION: In discovery studies, where high-ranking markers are identified and later examined in validation studies, we therefore expect GENMIX to enrich the candidates brought to follow-up studies with true positives over false positives more than MMA would.
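The simulation design described in this abstract is a full factorial grid. As a minimal sketch (the variable and function names are illustrative assumptions, not taken from the paper), the scenario count can be checked by enumerating the three factors:

```python
from itertools import product

# Factor levels as listed in the abstract.
HERITABILITIES = (0.21, 0.34, 0.64)
MAFS = (0.05, 0.10, 0.15, 0.20, 0.25, 0.35, 0.45)
QTL_EFFECTS = (0.1, 0.2, 0.5, 0.7, 1.0)  # phenotypic standard deviation units
REPLICATES = 25


def scenario_grid():
    """Return every (heritability, MAF, QTL effect) combination."""
    return list(product(HERITABILITIES, MAFS, QTL_EFFECTS))


scenarios = scenario_grid()
total_runs = len(scenarios) * REPLICATES
print(len(scenarios), total_runs)  # 105 2625
```

With 25 replicates per scenario, the design implies 2625 simulated data sets in total.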

Public Library of Science (PLOS)

Accurate HLA type inference using a weighted similarity graph

Author: A Gusev
AJ Monsuur
C Vandiedonck
DE Goldberg
J Li
J Xiao
Jing Li
JM Barker
L Handunnetthi
L Koskinen
Minzhu Xie
MN Setty
PIW de Bakker
S Leslie
T Shiina
Tao Jiang
V Bansal
X Li
Publication venue: BioMed Central
Publication date: 14/12/2010

Background: The human leukocyte antigen (HLA) system contains many highly variable genes. HLA genes play an important role in the human immune system, and HLA gene matching is crucial for the success of human organ transplantation. Numerous studies have demonstrated that variation in HLA genes is associated with many autoimmune, inflammatory and infectious diseases. However, typing HLA genes by serology or PCR is time-consuming and expensive, which limits large-scale studies involving HLA genes. Since it is much easier and cheaper to obtain single nucleotide polymorphism (SNP) genotype data, accurate computational algorithms to infer HLA gene types from SNP genotype data are needed. To infer HLA types from SNP genotypes, the first step is to infer SNP haplotypes from genotypes. However, for the same SNP genotype data set, the haplotype configurations inferred by different methods are usually inconsistent, and it is often difficult to decide which one is true. Results: In this paper, we design an accurate HLA gene type inference algorithm by utilizing SNP genotype data from pedigrees, known HLA gene types of some individuals and the relationship between inferred SNP haplotypes and HLA gene types. Given a set of haplotypes inferred from the genotypes of a population consisting of many pedigrees, the algorithm first constructs a weighted similarity graph based on a new haplotype similarity measure and derives constraint edges from known HLA gene types. Based on the principle that different HLA gene alleles should have different background haplotypes, the algorithm searches for an optimal labeling of all the haplotypes with unknown HLA gene types such that the total weight among the same HLA gene types is maximized. To deal with ambiguous haplotype solutions, we use a genetic algorithm to select haplotype configurations that tend to maximize the same optimization criterion. Our experiments on a previously typed subset of the HapMap data show that the algorithm is highly accurate, achieving an accuracy of 96% for gene HLA-A, 95% for HLA-B, 97% for HLA-C, 84% for HLA-DRB1, 98% for HLA-DQA1 and 97% for HLA-DQB1 in a leave-one-out test. Conclusions: Our algorithm can infer HLA gene types from neighboring SNP genotype data accurately. Compared with a recent approach on the same input data, our algorithm achieved a higher accuracy. The code of our algorithm is available to the public for free upon request to the corresponding authors.

Springer - Publisher Connector

eScholarship - University of California

Genetic risk factors for ischaemic stroke and its subtypes (the METASTROKE Collaboration): a meta-analysis of genome-wide association studies

Author: Abboud S
Achterberg S
Algra A
Benn M
Berger K
Bevan S
Bis JC
Boncoraglio GB
Carty C
Chen WM
Cheng YC
Clarke R
Davies G
de Bakker PIW
Deary I
Delavaran H
DeStefano AL
Doney ASF
Farrall M
Fernandez-Cadenas I
Ferro JM
Fornage M
Furie K
Gretarsdottir S
Gschwendtner A
Helgadottir A
Higgins P
Ho WK
Hofman A
Holliday EG
Hopewell JC
Ikram MA
Khan MS
Kittner SJ
Kostulas K
Kuhlenbäumer G
Lemmens R
Levi C
Lindgren A
Longstreth WT
Malik R
Mitchell BD
Montaner J
Mosley TH
Nalls MA
Nordestgaard BG
Norrving B
O'Donnell M
Oliveira SA
Palmer CNA
Pandolfo M
Parati EA
Paré G
Pera J
Psaty BM
Reiner AP
Ringelstein EB
Rosand J
Rothwell PM
Sale M
Saleheen D
Schmidt H
Schmidt R
Seshadri S
Sharma P
Slowik A
Sudlow C
Thijs V
Thorleifsson G
Thorsteinsdottir U
Traylor M
Valdimarsson E
van Zuydam NR
Vicente AM
Walters M
Wiggins KL
Worrall BB
Yadav S
Publication venue: Elsevier BV
Publication date: 01/01/2012

Background - Various genome-wide association studies (GWAS) have been done in ischaemic stroke, identifying a few loci associated with the disease, but sample sizes have been 3500 cases or fewer. We established the METASTROKE collaboration with the aim of validating associations from previous GWAS and identifying novel genetic associations through meta-analysis of GWAS datasets for ischaemic stroke and its subtypes. Methods - We meta-analysed data from 15 ischaemic stroke cohorts with a total of 12 389 individuals with ischaemic stroke and 62 004 controls, all of European ancestry. For the associations reaching genome-wide significance in METASTROKE, we did a further analysis, conditioning on the lead single nucleotide polymorphism in every associated region. Replication of novel suggestive signals was done in 13 347 cases and 29 083 controls. Findings - We verified previous associations for cardioembolic stroke near PITX2 (p=2·8×10⁻¹⁶) and ZFHX3 (p=2·28×10⁻⁸), and for large-vessel stroke at a 9p21 locus (p=3·32×10⁻⁵) and HDAC9 (p=2·03×10⁻¹²). Additionally, we verified that all associations were subtype specific. Conditional analysis in the three regions for which the associations reached genome-wide significance (PITX2, ZFHX3, and HDAC9) indicated that all the signal in each region could be attributed to one risk haplotype. We also identified 12 potentially novel loci at p<5×10⁻⁶. However, we were unable to replicate any of these novel associations in the replication cohort. Interpretation - Our results show that, although genetic variants can be detected in patients with ischaemic stroke when compared with controls, all associations we were able to confirm are specific to a stroke subtype. This finding has two implications. First, to maximise success of genetic studies in ischaemic stroke, detailed stroke subtyping is required. Second, different genetic pathophysiological mechanisms seem to be associated with different stroke subtypes.

ResearchOnline at James Cook University

Edinburgh Research Explorer

Leiden University Scholary Publications

Enlighten

Erasmus University Digital Repository

Access to Research at National University of Ireland, Galway

University of Newcastle's Digital Repository

ResearchOnline@JCU

Elsevier - Publisher Connector

Lund University Publications

Copenhagen University Research Information System

EUR Research Repository

Oxford University Research Archive

University of Dundee Online Publications

St George's Online Research Archive

Repositório Científico do Instituto Nacional de Saúde

A fast algorithm for genome-wide haplotype pattern mining

Author: AP Morris
Christian NS Pedersen
DE Arking
DJ Smyth
F Larribe
HT Toivonen
HTT Toivonen
I Pe'er
J Gudmundsson
J Li
J Molitor
JS Liu
LT Amundadottir
MJ Minichiello
PIW de Bakker
R Saxena
S Zöllner
SR Browning
Søren Besenbacher
T Mailund
Thomas Mailund
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Identifying the genetic components of common diseases has long been an important area of research. Recently, genotyping technology has reached the level where it is cost-effective to genotype single nucleotide polymorphism (SNP) markers covering the entire genome, in thousands of individuals, and analyse such data for markers associated with a disease. The statistical power to detect association, however, is limited when markers are analysed one at a time. This can be alleviated by considering multiple markers simultaneously. The Haplotype Pattern Mining (HPM) method is a machine learning approach to do exactly this. Results: We present a new, faster algorithm for the HPM method. The new approach uses patterns of haplotype diversity in the genome: locally in the genome, the number of observed haplotypes is much smaller than the total number of possible haplotypes. We show that the new approach speeds up the HPM method by a factor of 2 on a genome-wide dataset with 5009 individuals typed at 491208 markers using default parameters, and by more if the pattern length is increased. Conclusion: The new algorithm speeds up the HPM method, and we show that it is feasible to apply HPM to whole-genome association mapping with thousands of individuals and hundreds of thousands of markers.

Springer - Publisher Connector

Public Library of Science (PLOS)

Absence of Evidence for MHC–Dependent Mate Selection within HapMap Populations

Author: Adnan Derti
AJ Hayter
C Ober
C Wedekind
Can Cenik
CE Garver-Apgar
Frederick P. Roth
J Havlicek
J Marchini
Molly Przeworski
Peter Kraft
PIW de Bakker
R Chaix
R Nuzzo
RN Thompson
RR Sokal
S Jacob
SC Roberts
T Rülicke
TD Wyatt
WK Potts
Publication venue: Public Library of Science
Publication date: 01/04/2010

The major histocompatibility complex (MHC) of immunity genes has been reported to influence mate choice in vertebrates, and a recent study presented genetic evidence for this effect in humans. Specifically, greater dissimilarity at the MHC locus was reported for European-American mates (parents in HapMap Phase 2 trios) than for non-mates. Here we show that the results depend on a few extreme data points, are not robust to conservative changes in the analysis procedure, and cannot be reproduced in an equivalent but independent set of European-American mates. Although some evidence suggests an avoidance of extreme MHC similarity between mates, rather than a preference for dissimilarity, limited sample sizes preclude a rigorous investigation. In summary, fine-scale molecular-genetic data do not conclusively support the hypothesis that mate selection in humans is influenced by the MHC locus.

Twenty-eight genetic loci associated with ST-T-wave amplitudes of the electrocardiogram

Author: Alonso A
Arking DE
Barnett P
Bis JC
Boyer LA
de Bakker PIW
de Boer RA
Duijn Cornelia
Eijgelsheim Mark
Franke L
Hillege HL
Hirschhorn JN
Isaacs Aaron
Kahonen M
Kors Jan
Leach IM
Lehtimaki T
Lyytikainen LP
Pers TH
Raitakari OT
Silva Aldana Claudia
Soliman EZ
Sotoodehnia N
van den Berg Marten
van der Harst P
van Gilst WH
Veldhuisen DJ
Verweij N
Wang X C
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2016

EUR Research Repository

The Impact of Imputation on Meta-Analysis of Genome-Wide Association Studies

Author: CA Anderson
DH Xiong
DK Sanghera
E Evangelou
E Zeggini
FK Kavvoura
H Staiger
Hong-Wen Deng
J Marchini
JD Cooper
Jian Li
MAR Ferreira
MI McCarthy
MM Iles
Momiao Xiong
MX Li
PIW de Bakker
RJ Xavier
RS Houlston
S Raychaudhuri
W Cochran
Yan-fang Guo
YF Pei
Yufang Pei
Z Su
ZM Zhao
Publication venue: Public Library of Science
Publication date: 01/01/2011

Genotype imputation is often used in the meta-analysis of genome-wide association studies (GWAS) to combine data from different studies and/or genotyping platforms, in order to improve the ability to detect disease variants with small to moderate effects. However, how genotype imputation affects the performance of the meta-analysis of GWAS is largely unknown. In this study, we investigated the effects of genotype imputation on the performance of meta-analysis through simulations based on empirical data from the Framingham Heart Study. We found that when fixed-effects models were used, considerable between-study heterogeneity was detected when causal variants were typed in only some but not all individual studies, resulting in up to ∼25% reduction of detection power. In certain situations, the power of the meta-analysis can be even less than that of individual studies. Additional analyses showed that the detection power was slightly improved when between-study heterogeneity was partially controlled through the random-effects model, relative to that of the fixed-effects model. Our study may aid in the planning, data analysis, and interpretation of GWAS meta-analysis results when genotype imputation is necessary.
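The contrast this abstract draws between fixed-effects and random-effects meta-analysis can be illustrated with a small sketch. It assumes inverse-variance weighting, Cochran's Q as the heterogeneity statistic, and the DerSimonian-Laird estimator for the between-study variance; the abstract does not say which estimators the authors used, and the function names and input values below are hypothetical.

```python
import math


def fixed_effect(betas, ses):
    """Inverse-variance weighted pooled effect and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    return beta, math.sqrt(1.0 / sum(weights))


def cochran_q(betas, ses):
    """Cochran's Q: weighted squared deviations from the fixed-effect estimate."""
    beta_fe, _ = fixed_effect(betas, ses)
    return sum((b - beta_fe) ** 2 / se ** 2 for b, se in zip(betas, ses))


def random_effects(betas, ses):
    """DerSimonian-Laird random-effects estimate (assumed estimator)."""
    k = len(betas)
    weights = [1.0 / se ** 2 for se in ses]
    q = cochran_q(betas, ses)
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    star = [1.0 / (se ** 2 + tau2) for se in ses]
    beta = sum(w * b for w, b in zip(star, betas)) / sum(star)
    return beta, math.sqrt(1.0 / sum(star)), tau2


# Hypothetical per-study effect estimates that disagree strongly.
betas, ses = [0.0, 0.4], [0.1, 0.1]
print(fixed_effect(betas, ses))
print(cochran_q(betas, ses))
print(random_effects(betas, ses))
```

When the studies disagree, the estimated between-study variance is positive and the random-effects standard error is wider than the fixed-effects one, which is how the random-effects model accounts for heterogeneity.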

CiteSeerX

Development and application of genomic control methods for genome-wide association studies using non-additive models

Author: A Bittles
AL Price
B Devlin
Cornelia M. van Duijn
D Kobayashi
F Liu
G Zheng
H-E Wichmann
Harald Grallert
J Dupuis
J Liu
J Yu
Janina S. Ried
JK Pritchard
Konstantin Strauch
Lin Chen
M Kolz
P Gorroochurn
PA Oliehoek
PIW De Bakker
PJ McLaren
S-A Bacanu
T Dadd
T Yan
Tatiana I. Axenovich
W-M Chen
Y Zang
Yakov A. Tsepilov
YS Aulchenko
Yurii S. Aulchenko
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2013

Genome-wide association studies (GWAS) are a powerful tool for mapping genes of complex traits. However, an inflation of the test statistic can occur because of population substructure or cryptic relatedness, which could cause spurious associations. If information on a large number of genetic markers is available, adjusting the analysis results by using the method of genomic control (GC) is possible. GC was originally proposed to correct the Cochran-Armitage additive trend test. For non-additive models, correction has been shown to depend on allele frequencies. Therefore, usage of GC is limited to situations where allele frequencies of null markers and candidate markers are matched. In this work, we extended the capabilities of the GC method for non-additive models, which allows us to use null markers with arbitrary allele frequencies for GC. Analytical expressions for the inflation of a test statistic describing its dependency on allele frequency and several population parameters were obtained for recessive, dominant, and over-dominant models of inheritance. We proposed a method to estimate these required population parameters. Furthermore, we suggested a GC method based on approximation of the correction coefficient by a polynomial of allele frequency and described procedures to correct the genotypic (two degrees of freedom) test for cases when the model of inheritance is unknown. Statistical properties of the described methods were investigated using simulated and real data. We demonstrated that all considered methods were effective in controlling type 1 error in the presence of genetic substructure. The proposed GC methods can be applied to statistical tests for GWAS with various models of inheritance. All methods developed and tested in this work were implemented in the R language as part of the GenABEL package.
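For orientation, the classic genomic control correction that this abstract builds on can be sketched as follows. This is the standard constant-inflation version for a one-degree-of-freedom test, not the frequency-dependent extension the paper proposes, and it is not the GenABEL API. The function names and example statistics are hypothetical, and flooring the inflation factor at 1 is a common convention assumed here.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(null_stats):
    """Estimate lambda from 1-df test statistics at presumed null markers."""
    return max(1.0, median(null_stats) / CHI2_1DF_MEDIAN)


def gc_correct(stats, lam):
    """Deflate test statistics by the estimated inflation factor."""
    return [s / lam for s in stats]


# Hypothetical null-marker statistics whose median is twice the expected one.
null_stats = [0.2, 0.5, 2 * CHI2_1DF_MEDIAN, 1.4, 3.1]
lam = inflation_factor(null_stats)
print(lam, gc_correct([8.0], lam))
```

The abstract's point is that for recessive, dominant, and over-dominant models a single constant like this is not sufficient, because the inflation depends on allele frequency.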