
NSU Works

Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets

Author: AW Kung
B Han
BN Howie
CA Anderson
DE Reich
DR Nyholt
DY Lin
E Lander
F Dudbridge
G Montana
I Pe’er
J Li
J Ragoussis
JC Barrett
JM Cheverud
Juilian M. Y. Yeung
K Hao
KA Frazer
Miao-Xin Li
ML Metzker
MX Li
NW Galwey
P Duggal
Pak C. Sham
R Pahl
S Purcell
SA Tishkoff
SR Seaman
Stacey S. Cherny
V Moskvina
WG Hill
X Gao
Publication venue: Springer-Verlag
Publication date: 01/01/2012

Current genome-wide association studies (GWAS) use commercial genotyping microarrays that can assay over a million single nucleotide polymorphisms (SNPs). The number of SNPs is further boosted by advanced statistical genotype-imputation algorithms and large SNP databases for reference human populations. The testing of a huge number of SNPs needs to be taken into account in the interpretation of statistical significance in such genome-wide studies, but this is complicated by the non-independence of SNPs because of linkage disequilibrium (LD). Several previous groups have proposed the use of the effective number of independent markers (Me) for the adjustment of multiple testing, but current methods of calculation for Me are limited in accuracy or computational speed. Here, we report a more robust and fast method to calculate Me. Applying this efficient method [implemented in a free software tool named Genetic type 1 error calculator (GEC)], we systematically examined the Me, and the corresponding p-value thresholds required to control the genome-wide type 1 error rate at 0.05, for 13 Illumina or Affymetrix genotyping arrays, as well as for HapMap Project and 1000 Genomes Project datasets which are widely used in genotype imputation as reference panels. Our results suggested the use of a p-value threshold of ~10⁻⁷ as the criterion for genome-wide significance for early commercial genotyping arrays, but slightly more stringent p-value thresholds of ~5 × 10⁻⁸ for current or merged commercial genotyping arrays, ~10⁻⁸ for all common SNPs in the 1000 Genomes Project dataset and ~5 × 10⁻⁸ for the common SNPs only within genes.
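As a rough illustration of how an effective number of independent tests (Me) translates into a significance threshold, the sketch below applies the standard Bonferroni and Šidák corrections. This is not the GEC algorithm, which is not described in the abstract; the function names and the example value of one million effective tests are assumptions chosen only for illustration.

```python
import math


def bonferroni_threshold(m_e, alpha=0.05):
    """Per-test p-value threshold under a Bonferroni correction for m_e tests."""
    return alpha / m_e


def sidak_threshold(m_e, alpha=0.05):
    """Per-test threshold under a Sidak correction: 1 - (1 - alpha)^(1/m_e).

    Written with log1p/expm1 so that very small thresholds keep their precision.
    """
    return -math.expm1(math.log1p(-alpha) / m_e)


# Hypothetical example: one million effective independent tests.
m_e = 1_000_000
print("Bonferroni:", bonferroni_threshold(m_e))
print("Sidak:     ", sidak_threshold(m_e))
```

With one million effective tests and a family-wise error rate of 0.05, the Bonferroni threshold is about 5 × 10⁻⁸, which matches the order of magnitude quoted for current arrays.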

Springer - Publisher Connector

HKU Scholars Hub

Fine-scale detection of population-specific linkage disequilibrium using haplotype entropy in the human genome

Author: A Carvajal-Rodríguez
AF Reis
Alexei Vazquez
Arnold J Levine
BF Voight
BL Niell
D Gezen-Ak
DC Crawford
DE Reich
EJ Parra
G Ménasché
G Ribas
GS Atwal
Gurinder Atwal
Haijian Wang
Hideaki Mizuno
International HapMap Consortium
J Costas
J McGrath
J Zhang
JD Simmons
JK Pickrell
JM Valdivielso
KM Teshima
LE Matesic
M Kuningas
M Nothnagel
MV Rockman
NG Jablonski
PC Sabeti
R Bouillon
R Nielsen
S Myles
SA Tishkoff
SH Williamson
T Nakajima
WP Walker
Y Picornell
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The creation of a coherent genomic map of recent selection is one of the greatest challenges towards a better understanding of human evolution and the identification of functional genetic variants. Several methods have been proposed to detect linkage disequilibrium (LD), which is indicative of natural selection, from genome-wide profiles of common genetic variations, but they are designed for large regions. Results: To find population-specific LD within small regions, we have devised an entropy-based method that utilizes differences in haplotype frequency between populations. The method has the advantages of incorporating multilocus association, conciliation with low allele frequencies, and independence from allele polarity, which are ideal for short haplotype analysis. The comparison of HapMap SNP data from African and Caucasian populations with a median resolution size of ~23 kb gave us novel candidates as well as known selection targets. Enrichment analysis for the yielded genes showed associations with diverse diseases such as cardiovascular, immunological, neurological, and skeletal and muscular diseases. A possible scenario for a selective force is discussed. In addition, we have developed a web interface (ENIGMA, available at http://gibk21.bse.kyutech.ac.jp/ENIGMA/index.html), which allows researchers to query their regions of interest for population-specific LD. Conclusion: The haplotype entropy method is powerful for detecting population-specific LD embedded in short regions and should contribute to further studies aiming to decipher the evolutionary histories of modern humans.
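To make the idea of haplotype entropy concrete, the following sketch computes the plain Shannon entropy of haplotype frequencies in two toy samples. It is only an illustration of the general concept: the abstract does not give the authors' actual statistic, and the function name and the toy haplotypes are assumptions. A window dominated by a few haplotypes (strong LD) has low entropy, so a large entropy difference between populations hints at population-specific LD.

```python
import math
from collections import Counter


def haplotype_entropy(haplotypes):
    """Shannon entropy (in bits) of the haplotype frequency distribution."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy data (hypothetical): each string is one phased haplotype over three SNPs.
pop_a = ["ACG", "ACG", "ACG", "ACG"]   # one haplotype only: strong LD
pop_b = ["ACG", "ATG", "GCG", "GTA"]   # four equally frequent haplotypes

entropy_a = haplotype_entropy(pop_a)
entropy_b = haplotype_entropy(pop_b)
print("entropy difference (bits):", entropy_b - entropy_a)
```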

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

Genome-Wide Association Studies of the PR Interval in African Americans

The PR interval on the electrocardiogram reflects atrial and atrioventricular nodal conduction time. The PR interval is heritable, provides important information about arrhythmia risk, and has been suggested to differ among human races. Genome-wide association (GWA) studies have identified common genetic determinants of the PR interval in individuals of European and Asian ancestry, but there is a general paucity of GWA studies in individuals of African ancestry. We performed GWA studies in African American individuals from four cohorts (n = 6,247) to identify genetic variants associated with PR interval duration. Genotyping was performed using the Affymetrix 6.0 microarray. Imputation was performed for 2.8 million single nucleotide polymorphisms (SNPs) using combined YRI and CEU HapMap phase II panels. We observed a strong signal (rs3922844) within the gene encoding the cardiac sodium channel (SCN5A) with genome-wide significant association (p < 2.5 × 10⁻⁸) in two of the four cohorts and in the meta-analysis. The signal explained 2% of PR interval variability in African Americans (beta = 5.1 msec per minor allele, 95% CI = 4.1–6.1, p = 3 × 10⁻²³). This SNP was also associated with PR interval (beta = 2.4 msec per minor allele, 95% CI = 1.8–3.0, p = 3 × 10⁻¹⁶) in individuals of European ancestry (n = 14,042), but with a smaller effect size (p for heterogeneity < 0.001) and variability explained (0.5%). Further meta-analysis of the four cohorts identified genome-wide significant associations with SNPs in SCN10A (rs6798015), MEIS1 (rs10865355), and TBX5 (rs7312625) that were highly correlated with SNPs identified in European and Asian GWA studies. African ancestry was associated with increased PR duration (13.3 msec, p = 0.009) in one but not the other three cohorts.
Our findings demonstrate the relevance of common variants to African Americans at four loci previously associated with PR interval in European and Asian samples and identify an association signal at one of these loci that is more strongly associated with PR interval in African Americans than in Europeans.
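The effect estimate, confidence interval, and p-value reported for rs3922844 in African Americans can be cross-checked with a short calculation. The sketch below assumes a symmetric 95% interval built from a normal approximation (beta ± 1.96 × SE) and a two-sided test; the function name is hypothetical, and because the published beta and interval are rounded, the result should agree with the reported p-value only in order of magnitude.

```python
import math


def z_and_p_from_ci(beta, ci_low, ci_high):
    """Recover the standard error, z-score, and two-sided p-value
    from an effect estimate and its 95% confidence interval,
    assuming the interval is beta +/- 1.96 * SE."""
    se = (ci_high - ci_low) / (2 * 1.96)
    z = beta / se
    # Two-sided normal tail probability: P(|Z| > z) = erfc(z / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values quoted in the abstract for African American individuals.
se, z, p = z_and_p_from_ci(beta=5.1, ci_low=4.1, ci_high=6.1)
print(f"SE = {se:.3f} msec, z = {z:.2f}, p = {p:.1e}")
```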

Lund University Publications

KOPS - The Institutional Repository of the University of Konstanz

Microarray-Based Maps of Copy-Number Variant Regions in European and Sub-Saharan Populations

Author: A Rovelet-Lecrux
AJ Sharp
Andreas Huber
Andreas Papassotiropoulos
Attila Stetak
BE Stranger
Benno Röthlisberger
Bianca Auschra
C Xie
Christian Vogler
Dominique J.-F. de Quervain
E Gonzalez
ES Venkatraman
I Filges
Iris-Tatjana Kolassa
Isabel Filges
J Sebat
JA Bailey
JM Korn
JR Lupski
K Wang
L Feuk
L Winchester
Leo Gschwind
LJ Handley
LP Onyut
M Jakobsson
M Via
NA Rosenberg
NM Solomon
P Cahan
Peter Miny
Philippe Demougin
PJ Hastings
PM Kim
R Redon
SA McCarroll
SA Tishkoff
SB Gabriel
Thomas Elbert
Thomas Mailund
TL Yang
V Goidts
Vanja Vukojevic
YY Teo
Publication venue: Public Library of Science
Publication date: 01/01/2010

The genetic basis of phenotypic variation can be partially explained by the presence of copy-number variations (CNVs). Currently available methods for CNV assessment include high-density single-nucleotide polymorphism (SNP) microarrays, which have become an indispensable tool in genome-wide association studies (GWAS). However, insufficient concordance rates between different CNV assessment methods call for cautious interpretation of results from CNV-based genetic association studies. Here we provide a cross-population, microarray-based map of copy-number variant regions (CNVRs) to enable reliable interpretation of CNV association findings. We used the Affymetrix Genome-Wide Human SNP Array 6.0 to scan the genomes of 1167 individuals from two ethnically distinct populations (Europe, N = 717; Rwanda, N = 450). Three different CNV-finding algorithms were tested and compared for sensitivity, specificity, and feasibility. Two algorithms were subsequently used to construct CNVR maps, which were also validated by processing subsamples with additional microarray platforms (Illumina 1M-Duo BeadChip, Nimblegen 385K aCGH array) and by comparing our data with publicly available information. Both algorithms detected a total of 42669 CNVs, 74% of which clustered in 385 CNVRs of a cross-population map. These CNVRs overlap with 862 annotated genes and account for approximately 3.3% of the haploid human genome.
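One simple way to picture how individual CNV calls cluster into CNVRs is to merge overlapping intervals on the same chromosome. The abstract does not state the clustering rule the authors used, so the any-overlap criterion, the function name, and the toy coordinates below are all assumptions made for illustration.

```python
def merge_into_regions(cnvs):
    """Merge overlapping CNV calls into regions.

    cnvs: iterable of (chromosome, start, end) tuples.
    Returns a list of merged (chromosome, start, end) tuples.
    Assumption: two calls belong to the same region if they overlap at all.
    """
    regions = []
    for chrom, start, end in sorted(cnvs):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Overlaps the previous region on the same chromosome: extend it.
            prev_chrom, prev_start, prev_end = regions[-1]
            regions[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            regions.append((chrom, start, end))
    return regions


# Toy calls (hypothetical coordinates).
calls = [
    ("chr1", 100, 500),
    ("chr1", 400, 900),
    ("chr1", 2000, 2500),
    ("chr2", 100, 300),
]
regions = merge_into_regions(calls)
print(regions)
```

Real pipelines often require a minimum reciprocal overlap or a minimum number of supporting samples before declaring a region; the any-overlap rule here is the simplest possible choice.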

edoc

Parallel Adaptive Divergence among Geographically Diverse Human Populations

Author: A Sakuntabhai
AB Paaby
AC Thomas
AG Clark
Akey
AM Hancock
BS Weir
D Garrigan
DL Stern
DW Huang
E Patin
G Coop
Garland T Jr
Greg Gibson
H Nan
H Shimodaira
HM Cann
Hudson
J Arendt
J Felsenstein
J Flint
J González
Jacob A. Tennessen
JG Oakeshott
JK Pickrell
JK Pritchard
JM Smith
Joshua M. Akey
JP Bollback
JZ Li
L Gerardino
M Fumagalli
M Manceau
MJ Nadeau
N Gompel
NJ Fagundes
P Ralph
PA Hohenlohe
PF Colosimo
R Dualan
RD Barrett
RD Hernandez
RL Lamason
RS Devon
S Biswas
S Nejentsev
SA Tishkoff
SF Schaffner
SM Rogers
Y Jamshidi
Y Li
Publication venue: Public Library of Science
Publication date: 01/06/2011

Few genetic differences between human populations conform to the classic model of positive selection, in which a newly arisen mutation rapidly approaches fixation in one lineage, suggesting that adaptation more commonly occurs via moderate changes in standing variation at many loci. Detecting and characterizing this type of complex selection requires integrating individually ambiguous signatures across genomically and geographically extensive data. Here, we develop a novel approach to test the hypothesis that selection has favored modest divergence at particular loci multiple times in independent human populations. We find an excess of SNPs showing non-neutral parallel divergence, enriched for genic and nonsynonymous polymorphisms in genes encompassing diverse and often disease-related functions. Repeated parallel evolution in the same direction suggests common selective pressures in disparate habitats. We test our method with extensive coalescent simulations and show that it is robust to a wide range of demographic events. Our results demonstrate phylogenetically orthogonal patterns of local adaptation caused by subtle shifts at many widespread polymorphisms that likely underlie substantial phenotypic diversity.

Patterns of Ancestry, Signatures of Natural Selection, and Genetic Association with Stature in Western African Pygmies

Author: A La Batide-Alanore
A Leonhardt
AB Migliano
AL Price
Alain Froment
AR Boyko
B Pasaniuc
Bart Ferwerda
BF Voight
BS Weir
C Ballard
C Batini
CA Winkler
CC Khor
CD Huff
Charla Lambert
D Lopez Herraez
D Philipson
D Redelman
DL Rimoin
E Patin
G Baumann
G Destro-Bisol
GA McVean
Gabriel Hoffman
GH Perry
H Eleftherohorinou
H Innan
H Lango Allen
H Tang
H Yasukawa
HJ Bandelt
HM Kang
J Chen
J Kamath
Jason Mezey
JD Storey
JE Pool
Jean-Marie Bodo
JK Pickrell
JK Pritchard
JM Akey
JM Kidd
Joseph P. Jarvis
Joshua M. Akey
JZ Li
K Bryc
K Tang
L Quintana-Murci
Larsson Omberg
Laura B. Scheinfeldt
LG Moore
LJ Young
M Bozzola
M Joron
M Pelican
M Stephens
MB Lanktree
MD Shriver
MG de Silva
N Davila
NS Becker
P Librado
P Moorjani
P Scheet
P Verdu
PA Fujita
PC Sabeti
PR Dormitzer
R Chakraborty
R Kimura
S Jain
S Ludwig
S Purcell
S Sankararaman
SA Miller
SA Tishkoff
Sameer Soi
Sarah A. Tishkoff
SH Williamson
SJ Kang
ST Sherry
TJ Merimee
TJ Pemberton
William Beggs
WS Alexander
Y Benjamini
Y Chen
Y Hattori
Publication venue: Public Library of Science
Publication date: 01/01/2012

African Pygmy groups show a distinctive pattern of phenotypic variation, including short stature, which is thought to reflect past adaptation to a tropical environment. Here, we analyze Illumina 1M SNP array data in three Western Pygmy populations from Cameroon and three neighboring Bantu-speaking agricultural populations with whom they have admixed. We infer genome-wide ancestry, scan for signals of positive selection, and perform targeted genetic association with measured height variation. We identify multiple regions throughout the genome that may have played a role in adaptive evolution, many of which contain loci with roles in growth hormone, insulin, and insulin-like growth factor signaling pathways, as well as immunity and neuroendocrine signaling involved in reproduction and metabolism. The most striking results are found on chromosome 3, which harbors a cluster of selection and association signals between approximately 45 and 60 Mb. This region also includes the positional candidate genes DOCK3, which is known to be associated with height variation in Europeans, and CISH, a negative regulator of cytokine signaling known to inhibit growth hormone-stimulated STAT5 signaling. Finally, pathway analysis for genes near the strongest signals of association with height indicates enrichment for loci involved in insulin and insulin-like growth factor signaling.

Horizon / Pleins textes

FigShare

Cryptic Distant Relatives Are Common in Both Isolated and Cosmopolitan Genetic Samples

Author: A Albrechtsen
A Auton
A Gusev
A Kitchen
A Kong
A Price
B Derrida
B McEvoy
BL Browning
BM Henn
Brenna M. Henn
C O'Dushlaine
CD Huff
CR Gignoux
D Behar
D Rohde
FS Alkuraya
G Atzmon
G Leibon
G Malecot
Henry Harpending
I Moltke
Itsik Pe'er
J Li
J Novembre
J. Michael Macpherson
JL Mountain
JM Macpherson
Joanna L. Mountain
L Scott
L Weiss
Lawrence Hon
M Epstein
M Kirin
M Nalls
M Slatkin
M Zlojutro
N Rosenberg
Nick Eriksson
R McQuillan
RR Hudson
S Browning
S Ramachandran
S Tishkoff
S Wang
Serge Saxonov
SR Browning
W Bodmer
Publication venue: Public Library of Science
Publication date: 01/01/2012

Although a few hundred single nucleotide polymorphisms (SNPs) suffice to infer close familial relationships, high-density genome-wide SNP data make possible the inference of more distant relationships such as 2nd to 9th cousinships. In order to characterize the relationship between genetic similarity and degree of kinship given a timeframe of 100–300 years, we analyzed the sharing of DNA inferred to be identical by descent (IBD) in a subset of individuals from the 23andMe customer database (n = 22,757) and from the Human Genome Diversity Panel (HGDP-CEPH, n = 952). With data from 121 populations, we show that the average amount of DNA shared IBD in most ethnolinguistically-defined populations, for example Native American groups, Finns and Ashkenazi Jews, differs from continentally-defined populations by several orders of magnitude. Via extensive pedigree-based simulations, we determined bounds for predicted degrees of relationship given the amount of genomic IBD sharing in both endogamous and ‘unrelated’ population samples. Using these bounds as a guide, we detected tens of thousands of 2nd to 9th degree cousin pairs within a heterogeneous set of 5,000 Europeans. The ubiquity of distant relatives, detected via IBD segments, in both ethnolinguistic populations and in large ‘unrelated’ population samples has important implications for genetic genealogy, forensics and genotype/phenotype mapping studies.
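For orientation, the textbook expectation for how quickly relatedness decays with cousin degree can be computed directly. The sketch below uses the standard coefficient of relationship for full k-th cousins in an outbred pedigree, r = 2^-(2k+1); it is not the authors' simulation procedure, and the function name is an assumption. Realized IBD sharing varies considerably around these expectations, which is one reason simulation-based bounds are useful.

```python
def expected_relatedness(cousin_degree):
    """Expected coefficient of relationship for full k-th cousins
    (two shared ancestors, no inbreeding): r = 2 ** -(2k + 1)."""
    return 2.0 ** -(2 * cousin_degree + 1)


# Expected relatedness for 2nd through 9th cousins.
for k in range(2, 10):
    print(f"{k}th-degree cousins: r = {expected_relatedness(k):.2e}")
```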

CiteSeerX

eScholarship - University of California

Chapman University Digital Commons

FigShare

Polymorphisms in genes of interleukin 12 and its receptors and their association with protection against severe malarial anaemia in children in western Kenya

Background: Malarial anaemia is characterized by destruction of malaria-infected red blood cells and suppression of erythropoiesis. Interleukin 12 (IL12) significantly boosts erythropoietic responses in murine models of malarial anaemia, and decreased IL12 levels are associated with severe malarial anaemia (SMA) in children. Based on the biological relevance of IL12 in malarial anaemia, the relationship between genetic polymorphisms of IL12 and its receptors and SMA was examined. Methods: Fifty-five tagging single nucleotide polymorphisms covering genes encoding two IL12 subunits, IL12A and IL12B, and its receptors, IL12RB1 and IL12RB2, were examined in a cohort of 913 children residing in the Asembo Bay region of western Kenya. Results: An increasing copy number of the minor variant (C) in IL12A (rs2243140) was significantly associated with a decreased risk of SMA (P = 0.006; risk ratio, 0.52 for carrying one copy of allele C and 0.28 for two copies). Individuals possessing two copies of a rare variant (C) in IL12RB1 (rs429774) also appeared to be strongly protected against SMA (P = 0.00005; risk ratio, 0.18). In addition, children homozygous for another rare allele (T) in IL12A (rs22431348) had a reduced risk of severe anaemia (SA) (P = 0.004; risk ratio, 0.69) and of severe anaemia with any parasitaemia (SAP) (P = 0.004; risk ratio, 0.66). In contrast, the AG genotype for another variant in IL12RB1 (rs383483) was associated with susceptibility to high-density parasitaemia (HDP) (P = 0.003; risk ratio, 1.21). Conclusions: This study has shown strong associations between polymorphisms in the genes of IL12A and IL12RB1 and protection from SMA in Kenyan children, suggesting that human genetic variants of IL12-related genes may significantly contribute to the development of anaemia in malaria patients.

LSTM Online Archive

Springer - Publisher Connector

Impact of Selection and Demography on the Diffusion of Lactase Persistence

Author: A Gotherstrom
A Prevosti
A Sabbagh
A Sanchez-Mazas
AK Roychoudhury
Alicia Sanchez-Mazas
C Holden
C Renfrew
CJ Edwards
CJ Ingram
Céline Moret
D Helmer
D Meyer
Dennis O'Rourke
DM Swallow
EM Belle
FJ Simoons
G Flatz
G Malécot
G Mantel
GC Cook
GP Murdock
I Dupanloup
J Burger
JE Bowman
JM Travis
JT Troelsen
L Excoffier
LL Cavalli-Sforza
M Coelho
M Currat
M Nei
M Zvelebil
MA Beaumont
Mathias Currat
N Ray
NS Enattah
OD Solberg
Pascale Gerbault
R Pinhasi
RD McCracken
RR Sokal
S Klopfstein
SA Tishkoff
SJ Mack
T Bersaglieri
TB Gage
VC Sousa
Publication venue: Public Library of Science
Publication date: 01/01/2009

BACKGROUND: The lactase enzyme allows lactose digestion in fresh milk. Its activity strongly decreases after the weaning phase in most humans, but persists at a high frequency in Europe and some nomadic populations. Two hypotheses are usually proposed to explain the particular distribution of the lactase persistence phenotype. The gene-culture coevolution hypothesis supposes a nutritional advantage of lactose digestion in pastoral populations. The calcium assimilation hypothesis suggests that carriers of the lactase persistence allele(s) (LCT*P) are favoured in high-latitude regions, where sunshine is insufficient to allow adequate vitamin-D synthesis. In this work, we test the validity of these two hypotheses on a large worldwide dataset of lactase persistence frequencies by using several complementary approaches. METHODOLOGY: We first analyse the distribution of lactase persistence in various continents in relation to geographic variation, pastoralism levels, and the genetic patterns observed for other independent polymorphisms. Then we use computer simulations and a large database of archaeological dates for the introduction of domestication to explore the evolution of these frequencies in Europe according to different demographic scenarios and selection intensities. CONCLUSIONS: Our results show that gene-culture coevolution is a likely hypothesis in Africa, as high LCT*P frequencies are preferentially found in pastoral populations. In Europe, we show that population history played an important role in the diffusion of lactase persistence over the continent. Moreover, selection pressure on lactase persistence has been very high in the North-western part of the continent, in contrast to the South-eastern part, where genetic drift alone can explain the observed frequencies.
This selection pressure increasing with latitude is highly compatible with the calcium assimilation hypothesis, while the gene-culture coevolution hypothesis cannot be ruled out if a positively selected lactase gene was carried at the front of the expansion wave during the Neolithic transition in Europe.