
Harvard University - DASH

eScholarship - University of California

UCL Discovery

University of Miami: Scholarship Miami

George Washington University: Health Sciences Research Commons (HSRC)

FigShare

Accurate HLA type inference using a weighted similarity graph

Author: A Gusev
AJ Monsuur
C Vandiedonck
DE Goldberg
J Li
J Xiao
Jing Li
JM Barker
L Handunnetthi
L Koskinen
Minzhu Xie
MN Setty
PIW de Bakker
S Leslie
T Shiina
Tao Jiang
V Bansal
X Li
Publication venue: BioMed Central
Publication date: 14/12/2010

Abstract Background: The human leukocyte antigen system (HLA) contains many highly variable genes. HLA genes play an important role in the human immune system, and HLA gene matching is crucial for the success of human organ transplantations. Numerous studies have demonstrated that variation in HLA genes is associated with many autoimmune, inflammatory and infectious diseases. However, typing HLA genes by serology or PCR is time-consuming and expensive, which limits large-scale studies involving HLA genes. Since it is much easier and cheaper to obtain single nucleotide polymorphism (SNP) genotype data, accurate computational algorithms to infer HLA gene types from SNP genotype data are needed. To infer HLA types from SNP genotypes, the first step is to infer SNP haplotypes from genotypes. However, for the same SNP genotype data set, the haplotype configurations inferred by different methods are usually inconsistent, and it is often difficult to decide which one is true. Results: In this paper, we design an accurate HLA gene type inference algorithm by utilizing SNP genotype data from pedigrees, known HLA gene types of some individuals and the relationship between inferred SNP haplotypes and HLA gene types. Given a set of haplotypes inferred from the genotypes of a population consisting of many pedigrees, the algorithm first constructs a weighted similarity graph based on a new haplotype similarity measure and derives constraint edges from known HLA gene types. Based on the principle that different HLA gene alleles should have different background haplotypes, the algorithm searches for an optimal labeling of all the haplotypes with unknown HLA gene types such that the total weight among the same HLA gene types is maximized. To deal with ambiguous haplotype solutions, we use a genetic algorithm to select haplotype configurations that tend to maximize the same optimization criterion. Our experiments on a previously typed subset of the HapMap data show that the algorithm is highly accurate, achieving an accuracy of 96% for gene HLA-A, 95% for HLA-B, 97% for HLA-C, 84% for HLA-DRB1, 98% for HLA-DQA1 and 97% for HLA-DQB1 in a leave-one-out test. Conclusions: Our algorithm can infer HLA gene types from neighboring SNP genotype data accurately. Compared with a recent approach on the same input data, our algorithm achieved a higher accuracy. The code of our algorithm is available to the public for free upon request to the corresponding authors.
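
The labeling criterion described in this abstract (maximize the total similarity weight among haplotypes that share an HLA type) can be illustrated with a toy sketch. This is not the authors' algorithm: the names `within_label_weight` and `greedy_label`, the greedy assignment rule, and the example weights and allele labels are all assumptions made for illustration, and the paper's similarity measure, constraint edges and genetic algorithm are not reproduced here.

```python
def within_label_weight(weights, labels):
    """Objective: sum of edge weights over haplotype pairs with the same label."""
    total = 0.0
    for (i, j), w in weights.items():
        li, lj = labels.get(i), labels.get(j)
        if li is not None and li == lj:
            total += w
    return total


def greedy_label(weights, known, n):
    """Hypothetical greedy heuristic (not the published method).

    Each unlabeled haplotype takes the label whose already-typed
    neighbours contribute the largest total similarity weight.
    """
    labels = dict(known)
    for h in range(n):
        if h in labels:
            continue
        scores = {}
        for (i, j), w in weights.items():
            if h not in (i, j):
                continue
            other = j if i == h else i
            if other in known:
                lab = known[other]
                scores[lab] = scores.get(lab, 0.0) + w
        if scores:
            labels[h] = max(scores, key=scores.get)
    return labels


# Toy data: haplotypes 0 and 1 are typed, haplotype 2 is not.
weights = {(0, 2): 0.9, (1, 2): 0.2, (0, 1): 0.1}
known = {0: "A*02", 1: "A*01"}
labels = greedy_label(weights, known, 3)
```

In this toy graph, haplotype 2 is most similar to haplotype 0, so it inherits that label. A greedy pass like this only approximates the optimal labeling the abstract refers to.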

Springer - Publisher Connector

eScholarship - University of California

Genetic risk factors for ischaemic stroke and its subtypes (the METASTROKE Collaboration): a meta-analysis of genome-wide association studies

Author: Abboud S
Achterberg S
Algra A
Algra A
Benn M
Berger K
Bevan S
Bis JC
Boncoraglio GB
Carty C
Chen WM
Chen WM
Cheng YC
Clarke R
Davies G
de Bakker PIW
de Bakker PIW
de Bakker PIW
Deary I
Delavaran H
DeStefano AL
Doney ASF
Farrall M
Farrall M
Fernandez-Cadenas I
Ferro JM
Fornage M
Furie K
Gretarsdottir S
Gschwendtner A
Helgadottir A
Helgadottir A
Helgadottir A
Higgins P
Ho WK
Hofman A
Hofman A
Holliday EG
Hopewell JC
Ikram MA
Ikram MA
Ikram MA
Khan MS
Kittner SJ
Kittner SJ
Kostulas K
Kuhlenbäumer G
Lemmens R
Lemmens R
Lemmens R
Levi C
Lindgren A
Longstreth WT
Longstreth WT
Malik R
Mitchell BD
Montaner J
Mosley TH
Nalls MA
Nordestgaard BG
Nordestgaard BG
Norrving B
O'Donnell M
Oliveira SA
Palmer CNA
Pandolfo M
Parati EA
Paré G
Pera J
Psaty BM
Reiner AP
Ringelstein EB
Rosand J
Rothwell PM
Sale M
Sale M
Saleheen D
Saleheen D
Saleheen D
Schmidt H
Schmidt R
Seshadri S
Sharma P
Slowik A
Sudlow C
Thijs V
Thijs V
Thijs V
Thorleifsson G
Thorsteinsdottir U
Thorsteinsdottir U
Traylor M
Valdimarsson E
van Zuydam NR
Vicente AM
Walters M
Wiggins KL
Worrall BB
Worrall BB
Yadav S
Publication venue: Elsevier BV
Publication date: 01/01/2012

Background - Various genome-wide association studies (GWAS) have been done in ischaemic stroke, identifying a few loci associated with the disease, but sample sizes have been 3500 cases or less. We established the METASTROKE collaboration with the aim of validating associations from previous GWAS and identifying novel genetic associations through meta-analysis of GWAS datasets for ischaemic stroke and its subtypes. Methods - We meta-analysed data from 15 ischaemic stroke cohorts with a total of 12 389 individuals with ischaemic stroke and 62 004 controls, all of European ancestry. For the associations reaching genome-wide significance in METASTROKE, we did a further analysis, conditioning on the lead single nucleotide polymorphism in every associated region. Replication of novel suggestive signals was done in 13 347 cases and 29 083 controls. Findings - We verified previous associations for cardioembolic stroke near PITX2 (p=2·8×10⁻¹⁶) and ZFHX3 (p=2·28×10⁻⁸), and for large-vessel stroke at a 9p21 locus (p=3·32×10⁻⁵) and HDAC9 (p=2·03×10⁻¹²). Additionally, we verified that all associations were subtype specific. Conditional analysis in the three regions for which the associations reached genome-wide significance (PITX2, ZFHX3, and HDAC9) indicated that all the signal in each region could be attributed to one risk haplotype. We also identified 12 potentially novel loci at p<5×10⁻⁶. However, we were unable to replicate any of these novel associations in the replication cohort. Interpretation - Our results show that, although genetic variants can be detected in patients with ischaemic stroke when compared with controls, all associations we were able to confirm are specific to a stroke subtype. This finding has two implications. First, to maximise success of genetic studies in ischaemic stroke, detailed stroke subtyping is required. Second, different genetic pathophysiological mechanisms seem to be associated with different stroke subtypes.

ResearchOnline at James Cook University

Edinburgh Research Explorer

Leiden University Scholary Publications

Enlighten

Erasmus University Digital Repository

Access to Research at National University of Ireland, Galway

University of Newcastle's Digital Repository

Elsevier - Publisher Connector

Lund University Publications

Copenhagen University Research Information System

EUR Research Repository

Oxford University Research Archive

University of Dundee Online Publications

St George's Online Research Archive

Repositório Científico do Instituto Nacional de Saúde

A fast algorithm for genome-wide haplotype pattern mining

Author: AP Morris
Christian NS Pedersen
DE Arking
DJ Smyth
F Larribe
HT Toivonen
HTT Toivonen
I Pe'er
J Gudmundsson
J Gudmundsson
J Li
J Molitor
JS Liu
LT Amundadottir
MJ Minichiello
PIW de Bakker
R Saxena
S Zöllner
SR Browning
SR Browning
Søren Besenbacher
T Mailund
Thomas Mailund
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract Background: Identifying the genetic components of common diseases has long been an important area of research. Recently, genotyping technology has reached the level where it is cost-effective to genotype single nucleotide polymorphism (SNP) markers covering the entire genome, in thousands of individuals, and analyse such data for markers associated with a disease. The statistical power to detect association, however, is limited when markers are analysed one at a time. This can be alleviated by considering multiple markers simultaneously. The Haplotype Pattern Mining (HPM) method is a machine learning approach that does exactly this. Results: We present a new, faster algorithm for the HPM method. The new approach uses patterns of haplotype diversity in the genome: locally in the genome, the number of observed haplotypes is much smaller than the total number of possible haplotypes. We show that the new approach speeds up the HPM method by a factor of 2 on a genome-wide dataset with 5009 individuals typed at 491208 markers using default parameters, and by more if the pattern length is increased. Conclusion: The new algorithm speeds up the HPM method, and we show that it is feasible to apply HPM to whole-genome association mapping with thousands of individuals and hundreds of thousands of markers.

Springer - Publisher Connector

Public Library of Science (PLOS)

Absence of Evidence for MHC–Dependent Mate Selection within HapMap Populations

Author: Adnan Derti
AJ Hayter
C Ober
C Wedekind
Can Cenik
CE Garver-Apgar
Frederick P. Roth
J Havlicek
J Marchini
Molly Przeworski
Peter Kraft
PIW de Bakker
R Chaix
R Nuzzo
RN Thompson
RR Sokal
S Jacob
SC Roberts
SC Roberts
T Rülicke
TD Wyatt
WK Potts
Publication venue: Public Library of Science
Publication date: 01/04/2010

The major histocompatibility complex (MHC) of immunity genes has been reported to influence mate choice in vertebrates, and a recent study presented genetic evidence for this effect in humans. Specifically, greater dissimilarity at the MHC locus was reported for European-American mates (parents in HapMap Phase 2 trios) than for non-mates. Here we show that the results depend on a few extreme data points, are not robust to conservative changes in the analysis procedure, and cannot be reproduced in an equivalent but independent set of European-American mates. Although some evidence suggests an avoidance of extreme MHC similarity between mates, rather than a preference for dissimilarity, limited sample sizes preclude a rigorous investigation. In summary, fine-scale molecular-genetic data do not conclusively support the hypothesis that mate selection in humans is influenced by the MHC locus

Twenty-eight genetic loci associated with ST-T-wave amplitudes of the electrocardiogram

Author: Alonso A
Arking DE
Barnett P
Bis JC
Boyer LA
de Bakker PIW
de Boer RA
Duijn Cornelia
Eijgelsheim Mark
Franke L
Hillege HL
Hirschhorn JN
Isaacs Aaron
Kahonen M
Kors Jan
Leach IM
Lehtimaki T
Lyytikainen LP
Pers TH
Raitakari OT
Silva Aldana Claudia
Soliman EZ
Sotoodehnia N
van den Berg Marten
van der Harst P
van Gilst WH
Veldhuisen DJ
Verweij N
Wang X C
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2016

EUR Research Repository

The Impact of Imputation on Meta-Analysis of Genome-Wide Association Studies

Author: CA Anderson
DH Xiong
DK Sanghera
E Evangelou
E Zeggini
E Zeggini
FK Kavvoura
H Staiger
Hong-Wen Deng
J Marchini
JD Cooper
Jian Li
MAR Ferreira
MI McCarthy
MI McCarthy
MM Iles
Momiao Xiong
MX Li
PIW de Bakker
RJ Xavier
RS Houlston
S Raychaudhuri
W Cochran
Yan-fang Guo
YF Pei
Yufang Pei
Z Su
ZM Zhao
Publication venue: Public Library of Science
Publication date: 01/01/2011

Genotype imputation is often used in the meta-analysis of genome-wide association studies (GWAS), for combining data from different studies and/or genotyping platforms, in order to improve the ability to detect disease variants with small to moderate effects. However, how genotype imputation affects the performance of the meta-analysis of GWAS is largely unknown. In this study, we investigated the effects of genotype imputation on the performance of meta-analysis through simulations based on empirical data from the Framingham Heart Study. We found that when fixed-effects models were used, considerable between-study heterogeneity was detected when causal variants were typed in only some but not all individual studies, resulting in up to ∼25% reduction of detection power. In certain situations, the power of the meta-analysis can be even lower than that of individual studies. Additional analyses showed that the detection power was slightly improved when between-study heterogeneity was partially controlled through the random-effects model, relative to that of the fixed-effects model. Our study may aid in the planning, data analysis, and interpretation of GWAS meta-analysis results when genotype imputation is necessary.
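
For readers unfamiliar with the two model families contrasted in this abstract, the following is a minimal sketch of inverse-variance fixed-effects pooling and a random-effects estimate. The abstract does not say which random-effects estimator was used; the DerSimonian-Laird estimator shown here is an assumption, as are the function names and the toy effect sizes.

```python
import math


def fixed_effect(betas, ses):
    """Inverse-variance weighted pooled effect and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)


def cochran_q(betas, ses):
    """Cochran's Q statistic for between-study heterogeneity."""
    beta_fe, _ = fixed_effect(betas, ses)
    return sum((b - beta_fe) ** 2 / se ** 2 for b, se in zip(betas, ses))


def random_effects(betas, ses):
    """DerSimonian-Laird random-effects estimate (assumed estimator).

    Returns (pooled effect, standard error, tau^2), where tau^2 is the
    estimated between-study variance.
    """
    k = len(betas)
    w = [1.0 / se ** 2 for se in ses]
    q = cochran_q(betas, ses)
    c = sum(w) - sum(x * x for x in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    total = sum(w_star)
    beta = sum(ws * b for ws, b in zip(w_star, betas)) / total
    return beta, math.sqrt(1.0 / total), tau2


# Toy example: two studies with equal precision but different effects.
fe_beta, fe_se = fixed_effect([0.1, 0.3], [0.1, 0.1])
re_beta, re_se, tau2 = random_effects([0.1, 0.3], [0.1, 0.1])
```

When the study estimates disagree, tau^2 becomes positive and the random-effects standard error widens relative to the fixed-effects one; when they agree, the two estimates coincide.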

CiteSeerX

Genetic Susceptibility Loci for Cardiovascular Disease and Their Impact on Atherosclerotic Plaques

Author: Asl HF
Asselbergs FW
Bjorkegrenn JLM
de Bakker PIW
de Borst GJ
den Ruijter HM
Dichgans M
Erdmann J
Haitjema S
Hedin U
Malik R
Mokry M
Pasterkamp G
Paulsson-Berne G
Perisic L
Samani NJ
Schunkert H
Siemelink MA
van der Laan SW
van Setten J
Worrall BB
Publication venue: Ovid Technologies (Wolters Kluwer Health)
Publication date: 01/09/2018

BACKGROUND: Atherosclerosis is a chronic inflammatory disease in part caused by lipid uptake in the vascular wall, but the exact underlying mechanisms leading to acute myocardial infarction and stroke remain poorly understood. Large consortia identified genetic susceptibility loci that associate with large artery ischemic stroke and coronary artery disease. However, deciphering their underlying mechanisms is challenging. Histological studies identified destabilizing characteristics in human atherosclerotic plaques that associate with clinical outcome. To what extent established susceptibility loci for large artery ischemic stroke and coronary artery disease relate to plaque characteristics is thus far unknown but may point to novel mechanisms. METHODS: We studied the associations of 61 established cardiovascular risk loci with 7 histological plaque characteristics assessed in 1443 carotid plaque specimens from the Athero-Express Biobank Study. We also assessed whether the genotyped cardiovascular risk loci impact tissue-specific gene expression in 2 independent biobanks, Biobank of Karolinska Endarterectomy and Stockholm Atherosclerosis Gene Expression. RESULTS: A total of 21 established risk variants (out of 61) were nominally associated with a plaque characteristic. One variant (rs12539895, risk allele A) at 7q22 was associated with a reduction of intraplaque fat, P=5.09×10⁻⁶ after correction for multiple testing. We further characterized this 7q22 locus and show tissue-specific effects of rs12539895 on HBP1 expression in plaques and COG5 expression in whole blood, and provide data from public resources showing an association with decreased LDL (low-density lipoprotein) and increased HDL (high-density lipoprotein) in the blood. CONCLUSIONS: Our study supports the view that cardiovascular susceptibility loci may exert their effect by influencing the atherosclerotic plaque characteristics.

UCL Discovery

Haplotype-based quantitative trait mapping using a clustering algorithm

Author: AD Long
C Durrant
DB Allison
DJ Sheskin
GA Churchill
HT Toivonen
HW Deng
HW Deng
J Li
J Li
J Molitor
Jing Li
JM Comeron
JS Liu
JY Tzeng
K Song
K Zhang
L Kruglyak
M Ester
M Lynch
M Stephens
MJ Daly
MS McPeek
PIW de Bakker
R Fan
Robert C Elston
RR Hudson
S Zollner
SB Gabriel
T Niu
The International HapMap Consortium
The International HapMap Consortium
Yingyao Zhou
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: With the availability of large-scale, high-density single-nucleotide polymorphism (SNP) markers, substantial effort has been made in identifying disease-causing genes using linkage disequilibrium (LD) mapping by haplotype analysis of unrelated individuals. In addition to complex diseases, many continuously distributed quantitative traits are of primary clinical and health significance. However, the development of association mapping methods using unrelated individuals for quantitative traits has received relatively less attention. RESULTS: We recently developed an association mapping method for complex diseases by mining the sharing of haplotype segments (i.e., phased genotype pairs) in affected individuals that are rarely present in normal individuals. In this paper, we extend our previous work to address the problem of quantitative trait mapping from unrelated individuals. The method is non-parametric in nature, and statistical significance can be obtained by a permutation test. It can also be incorporated into the one-way ANCOVA (analysis of covariance) framework so that other factors and covariates can be easily incorporated. The effectiveness of the approach is demonstrated by extensive experimental studies using both simulated and real data sets. The results show that our haplotype-based approach is more robust than two statistical methods based on single markers: a single SNP association test (SSA) and the Mann-Whitney U-test (MWU). The algorithm has been incorporated into our existing software package called HapMiner, which is available from our website at . CONCLUSION: For QTL (quantitative trait loci) fine mapping, to identify QTNs (quantitative trait nucleotides) with realistic effects (the contribution of each QTN less than 10% of total variance of the trait), large sample sizes (≥ 500) are needed for all the methods. The overall performance of HapMiner is better than that of the other two methods. Its effectiveness further depends on other factors such as recombination rates and the density of typed SNPs. Haplotype-based methods might provide higher power than methods based on a single SNP when using tag SNPs selected from a small number of samples or some other sources (such as HapMap data). Rank-based statistics usually have much lower power, as shown in our study.
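
The abstract notes that significance can be obtained by a permutation test. A generic sketch of that idea for a quantitative trait is given below. It is not HapMiner's procedure: the two-group split (standing in for two haplotype clusters), the mean-difference statistic, and the names `group_stat` and `permutation_p_value` are assumptions chosen to keep the example small.

```python
import random


def group_stat(traits, groups):
    """Absolute difference in mean trait value between group 0 and group 1."""
    a = [t for t, g in zip(traits, groups) if g == 0]
    b = [t for t, g in zip(traits, groups) if g == 1]
    return abs(sum(a) / len(a) - sum(b) / len(b))


def permutation_p_value(traits, groups, n_perm=1000, seed=0):
    """Empirical p-value from shuffling trait values across individuals.

    Group membership (e.g. haplotype cluster) stays fixed; only the
    trait values are permuted, which simulates the null of no association.
    """
    rng = random.Random(seed)
    observed = group_stat(traits, groups)
    shuffled = list(traits)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point summation order.
        if group_stat(shuffled, groups) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy data: trait values differ clearly between the two clusters.
traits = [1.0, 1.1, 0.9, 1.2, 3.0, 3.1, 2.9, 3.2]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
p = permutation_p_value(traits, groups)
```

Because the null distribution is built from the data themselves, no distributional assumption on the trait is needed, which matches the non-parametric character described in the abstract.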

Springer - Publisher Connector