24 research outputs found

Identifying a High Fraction of the Human Genome to be under Selective Constraint Using GERP++

Author: A Siepel
APL NM Dempster
Arend Sidow
B Rhead
David L. Goode
E Birney
EH Margulies
Eugene V. Davydov
F Hsu
GM Cooper
Gregory M. Cooper
GT McVean
J Felsenstein
KS Pollard
M Blanchette
M Garber
M Hasegawa
M Kimura
Marina Sirota
RP Brent
Serafim Batzoglou
SF Altschul
SS Gross
T Jukes
WH Press
WJ Kent
Wyeth W. Wasserman
Publication venue: Public Library of Science
Publication date: 01/01/2010

Computational efforts to identify functional elements within genomes leverage comparative sequence information by looking for regions that exhibit evidence of selective constraint. One way of detecting constrained elements is to follow a bottom-up approach by computing constraint scores for individual positions of a multiple alignment and then defining constrained elements as segments of contiguous, highly scoring nucleotide positions. Here we present GERP++, a new tool that uses maximum likelihood evolutionary rate estimation for position-specific scoring and, in contrast to previous bottom-up methods, a novel dynamic programming approach to subsequently define constrained elements. GERP++ evaluates a richer set of candidate element breakpoints and ranks them based on statistical significance, eliminating the need for biased heuristic extension techniques. Using GERP++ we identify over 1.3 million constrained elements spanning over 7% of the human genome. We predict a higher fraction than earlier estimates largely due to the annotation of longer constrained elements, which improves the one-to-one correspondence between predicted elements and known functional sequences. GERP++ is an efficient and effective tool to provide both nucleotide- and element-level constraint scores within deep multiple sequence alignments.
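
The bottom-up idea summarized in this abstract (score each alignment position, then group contiguous high-scoring positions into elements) can be illustrated with a minimal sketch. This is not GERP++'s dynamic programming method, which the abstract does not specify in detail; it is the simpler fixed-threshold grouping, and the function name, threshold, and minimum length below are assumptions chosen for illustration.

```python
from typing import List, Tuple


def contiguous_segments(
    scores: List[float],
    threshold: float = 2.0,  # hypothetical cutoff, not a GERP++ default
    min_len: int = 3,        # hypothetical minimum element length
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index pairs for runs of positions
    whose score is >= threshold and whose length is >= min_len."""
    segments: List[Tuple[int, int]] = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i  # a new candidate run begins here
        elif start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None  # the run is closed by a low-scoring position
    # Close a run that extends to the end of the sequence.
    if start is not None and len(scores) - start >= min_len:
        segments.append((start, len(scores)))
    return segments


example_scores = [0.1, 2.5, 3.0, 2.2, -1.0, 4.0, 4.1, 0.0, 2.0, 2.0, 2.0, 2.0]
print(contiguous_segments(example_scores))  # [(1, 4), (8, 12)]
```

A fixed threshold makes element boundaries sensitive to single low-scoring positions, which is the kind of limitation that motivates evaluating many candidate breakpoints instead.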

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Processing and analyzing multiple genomes alignments with MafFilter

Author: A Scally
Aaron E. Darling
CC Chang
Danecek P Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, McVean G, Durbin R, 1000 Genomes Project Analysis Group
DG Higgins
EH Stukenbrock
J Casper
J Felsenstein
JB Lack
K Katoh
K Prüfer
L Duret
M Blanchette
M Hasegawa
M Slatkin
O Gascuel
S Guindon
S Kurtz
S Myers
S Schiffels
S Schwartz
SM Kiełbasa
SV Angiuoli
Publication venue: Springer Science and Business Media LLC
Publication date: 24/01/2020

As the number of available genome sequences from both closely related species and individuals within species increased, theoretical and methodological convergences between the fields of phylogenomics and population genomics emerged. Population genomics typically focuses on the analysis of variants, while phylogenomics heavily relies on genome alignments. However, genome alignments are playing an increasingly important role in studies at the population level. Multiple genome alignments of individuals are used when structural variation is of primary interest and when genome architecture permits the assembly of de novo genome sequences. Here I describe MafFilter, a command-line-driven program for processing genome alignments in the Multiple Alignment Format (MAF). Using concrete examples based on publicly available datasets, I demonstrate how MafFilter can be used to develop efficient and reproducible pipelines with quality assurance for downstream analyses. I further show how MafFilter can be used to perform both basic and advanced population genomic analyses in order to infer the patterns of nucleotide diversity along genomes.

Crossref

MPG.PuRe

Global assessment of genomic variation in cattle by genome resequencing and high-throughput genotyping

Author: A Ritz
AG Clark
AJ Iafrate
AR Quinlan
AV Zimin
AW Pang
Bo Thomsen
Bujie Zhan
C Alkan
C Spillane
C Xie
CA Albers
CA Heid
CG Elsik
Christian Bendixen
D Pushkarev
DA Wheeler
DF Conrad
DG Lemay
DJ de Koning
DM Larkin
DR Bentley
E Seroussi
EM Ibeagha-Awemu
ER Mardis
F Zhang
Frank Panitz
G Dennis Jr
G Lunter
GE Liu
GM Church
GP Consortium
GP Harhay
GT McVean
H Li
H Park
HB Fraser
J Eid
J Fadista
J Sebat
J Wang
Jakob Hedegaard
JC Dohm
JI Kim
João Fadista
JR Lupski
JS Bae
JW Drake
K Chen
K Wang
K Wong
K Ye
KJ McKernan
KU Mir
LA Hindorff
LK Matukumalli
LW Hillier
M Kirin
M Perez-Enciso
MA Taub
ME Goddard
ML Metzker
MW Nachman
O Harismendy
P Medvedev
P Stankiewicz
P Tong
PC Ng
R Kawahara-Miki
R Nielsen
R Redon
RA Cartwright
RA Gibbs
RE Mills
RL Tellam
S Levy
S Yoon
SC Schuster
SH Eck
SM Ahn
T Meuwissen
TH Meuwissen
V Ramensky
V Whan
V Yuzbasiyan-Gurkan
Y Erlich
Y Hou
Y Li
YS Ju
ZL Hu
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Integration of genomic variation with phenotypic information is an effective approach for uncovering genotype-phenotype associations. This requires an accurate identification of the different types of variation in individual genomes. Results: We report the integration of the whole genome sequence of a single Holstein Friesian bull with data from single nucleotide polymorphism (SNP) and comparative genomic hybridization (CGH) array technologies to determine a comprehensive spectrum of genomic variation. The performance of resequencing SNP detection was assessed by combining SNPs that were identified to be either in identity by descent (IBD) or in copy number variation (CNV) with results from SNP array genotyping. Coding insertions and deletions (indels) were found to be enriched for size in multiples of 3 and were located near the N- and C-termini of proteins. For larger indels, a combination of split-read and read-pair approaches proved to be complementary in finding different signatures. CNVs were identified on the basis of the depth of sequenced reads, and by using SNP and CGH arrays. Conclusions: Our results provide high resolution mapping of diverse classes of genomic variation in an individual bovine genome and demonstrate that structural variation surpasses sequence variation as the main component of genomic variability. Better accuracy of SNP detection was achieved with little loss of sensitivity when algorithms that implemented mapping quality were used. IBD regions were found to be instrumental for calculating resequencing SNP accuracy, while SNP detection within CNVs tended to be less reliable. CNV discovery was affected dramatically by platform resolution and coverage biases. The combined data for this study showed that at a moderate level of sequencing coverage, an ensemble of platforms and tools can be applied together to maximize the accurate detection of sequence and structural variants.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data

Author: A Keinan
A Kitchen
AG Clark
AJ Drummond
AL Caicedo
AL Price
AM Adams
AR Boyko
AS Kondrashov
BF Voight
C Becquet
C Wiuf
Carlos D. Bustamante
CD Bustamante
CJ Mulligan
D Garrigan
DA Pierce
DG Hwang
GA Watterson
Gil McVean
GT Marth
GV Kryukov
J Hey
J Wakeley
JD Hunter
JD Wall
JG Heinrich
JK Pritchard
JM Akey
JM Braverman
JN Fenner
JS Chang
JZ Li
L Zhu
M Cox
M Jakobsson
M Kimura
M Tremblay
ME Weale
N Patterson
P Mellars
R Nielsen
RA Fischer
RD Hernandez
RJ Livingston
RR Hudson
Ryan D. Hernandez
Ryan N. Gutenkunst
S Kumar
S Myers
SA Sawyer
Scott H. Williamson
SF Schaffner
SH Williamson
T Goebel
T Nagylaki
TE Oliphant
WH Press
WJ Ewens
Publication venue: Public Library of Science (PLoS)
Publication date: 04/09/2009

Demographic models built from genetic data play important roles in illuminating prehistorical events and serving as null models in genome scans for selection. We introduce an inference method based on the joint frequency spectrum of genetic variants within and between populations. For candidate models we numerically compute the expected spectrum using a diffusion approximation to the one-locus two-allele Wright-Fisher process, involving up to three simultaneous populations. Our approach is a composite likelihood scheme, since linkage between neutral loci alters the variance but not the expectation of the frequency spectrum. We thus use bootstraps incorporating linkage to estimate uncertainties for parameters and significance values for hypothesis tests. Our method can also incorporate selection on single sites, predicting the joint distribution of selected alleles among populations experiencing a bevy of evolutionary forces, including expansions, contractions, migrations, and admixture. As applications, we model human expansion out of Africa and the settlement of the New World, using 5 Mb of noncoding DNA resequenced in 68 individuals from 4 populations (YRI, CHB, CEU, and MXL) by the Environmental Genome Project. We also combine our demographic model with a previously estimated distribution of selective effects among newly arising amino acid mutations to accurately predict the frequency spectrum of nonsynonymous variants across three continental populations (YRI, CHB, CEU). Comment: 17 pages, 4 figures, supporting information included with sourc
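
The data summary this abstract builds on, the joint frequency spectrum, can be sketched for two populations as a table counting how many variant sites have each combination of derived-allele counts. This is only an illustration of the observed spectrum; it does not implement the diffusion approximation or the composite-likelihood fit, and the function name and input layout are assumptions rather than the interface of the authors' software.

```python
from typing import List, Sequence


def joint_sfs(
    counts_a: Sequence[int],  # derived-allele count per site in population A
    counts_b: Sequence[int],  # derived-allele count per site in population B
    n_a: int,                 # sampled chromosomes in population A
    n_b: int,                 # sampled chromosomes in population B
) -> List[List[int]]:
    """Tabulate a two-population joint frequency spectrum.

    Entry [i][j] is the number of sites where the derived allele was seen
    i times in population A and j times in population B.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both populations must cover the same sites")
    sfs = [[0] * (n_b + 1) for _ in range(n_a + 1)]
    for i, j in zip(counts_a, counts_b):
        if not (0 <= i <= n_a and 0 <= j <= n_b):
            raise ValueError("allele count outside the sample size")
        sfs[i][j] += 1
    return sfs


# Four hypothetical sites, two chromosomes sampled per population.
example = joint_sfs([0, 1, 1, 2], [1, 1, 1, 0], n_a=2, n_b=2)
for row in example:
    print(row)
# [0, 1, 0]
# [0, 2, 0]
# [1, 0, 0]
```

Under a composite-likelihood scheme such as the one the abstract describes, a table like this would be compared entry by entry with the spectrum expected under a candidate demographic model.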

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Human genomic regions with exceptionally high levels of population differentiation identified from 911 whole-genome sequences

Author: Abecasis GR
Albrecht MW
Altshuler DM
Amstislavskiy VS
Auton A
Ayub Q
Balasubramaniam S
Bentley DR
Borodina TA
Brooks LD
Burton J
Chakravarti A
Chen Y
Clark AG
Clarke L
Colonna V
Danecek P
Depristo MA
Dinh H
Donnelly P
Durbin RM
Eichler EE
Fang X
Flicek P
Fulton L
Fulton R
Gabriel SB
Garrison E
Gibbs RA
Green ED
Grocock R
Guo X
Gupta N
Handsaker RE
Humphray S
Hurles ME
James T
Jian M
Jiang H
Jin X
Kang HM
Keane TM
Kingsbury Z
Knoppers BM
Kolb-Kokocinski A
Korbel JO
Kovar C
Lander ES
Lee C
Lee S
Lehrach H
Leinonen R
Lewis L
Li G
Li J
Li Y
Li Z
Lienhard M
Liu X
Lu Y
Luisi P
Ma X
Mardis ER
Marth GT
McCarthy S
McVean G
McVean GA
Mertes F
Muzny D
Nickerson DA
Pagani L
Pybus M
Reid J
Schmidt JP
Sherry ST
Smith RE
Su Z
Sudbrak R
Sultan M
Tai S
Tang M
Timmermann B
Tyler-Smith C
Wang B
Wang G
Wang J
Wang M
Weinstock GM
Wilson RK
Wu H
Wu R
Xue Y
Yaspo ML
Yin Y
Zhang W
Zhao J
Zhao M
Zheng X
Zheng-Bradley X
Zhou Y
Publication venue: Springer Science and Business Media LLC
Publication date: 30/06/2014

Data availability: The 1000 Genomes phase I integrated callset used in this study is publicly available at [The 1000 Genomes Project. http://www.1000genomes.org/]. For a list of samples used in this study, refer to Table S1 in Additional file 1. Additional files: Additional file 1: This file contains supplementary Tables ST1 to ST4 (https://static-content.springer.com/esm/art%3A10.1186%2Fgb-2014-15-6-r88/MediaObjects/13059_2014_3364_MOESM1_ESM.xlsx). Additional file 2: This file contains supplementary Figures F1 to F16 (https://static-content.springer.com/esm/art%3A10.1186%2Fgb-2014-15-6-r88/MediaObjects/13059_2014_3364_MOESM2_ESM.pdf). Additional file 3: Full list of participants and institutions in the 1000 Genomes Project (https://static-content.springer.com/esm/art%3A10.1186%2Fgb-2014-15-6-r88/MediaObjects/13059_2014_3364_MOESM3_ESM.pdf). Copyright © 2014 Colonna et al.

Background: Population differentiation has proved to be effective for identifying loci under geographically localized positive selection, and has the potential to identify loci subject to balancing selection. We have previously investigated the pattern of genetic differentiation among human populations at 36.8 million genomic variants to identify sites in the genome showing high frequency differences. Here, we extend this dataset to include additional variants, survey sites with low levels of differentiation, and evaluate the extent to which highly differentiated sites are likely to result from selective or other processes. Results: We demonstrate that while sites with low differentiation represent sampling effects rather than balancing selection, sites showing extremely high population differentiation are enriched for positive selection events and that one half may be the result of classic selective sweeps. Among these, we rediscover known examples, where we actually identify the established functional SNP, and discover novel examples including the genes ABCA12, CALD1 and ZNF804, which we speculate may be linked to adaptations in skin, calcium metabolism and defense, respectively. Conclusions: We identify known and many novel candidate regions for geographically restricted positive selection, and suggest several directions for further research.

Funding: The Wellcome Trust (098051), an Italian National Research Council (CNR) short-term mobility fellowship from the 2013 program to VC, and an EMBO Short Term Fellowship ASTF 324–2010 to VC

Brunel University Research Archive

The evolution of genomic imprinting: Theories, predictions and empirical tests

Author: A Burt
AJ Moore
AO Urrutia
B Holland
B Hutter
B Sinervo
B Tier
C Proudhon
C Spillane
CI Castillo-Davis
D Autran
D C Queller
D Haig
DC Queller
DJ de Koning
F Úbeda
GT McVean
H Akashi
HG Spencer
J B Wolf
J P Curley
J Van Cleve
JB Wolf
JF Wilkins
JK Killian
JM Stringer
JP Curley
JW McGlothlin
K Foerster
KM Glastad
L Parker-Katiraee
L Ross
M Gehring
M M Patten
M Zhang
MB Renfree
MJ O'Connell
NGC Smith
P Innocenti
P Wolff
PC McKeown
R Bonduriansky
R Hager
RA Drewell
SA Frank
SV Yi
T Connallon
T Day
T Miyake
T Moore
TA Mousseau
X Wang
Y Brandvain
Y Ikeda
Publication venue: Springer Science and Business Media LLC
Publication date: 23/04/2014

The epigenetic phenomenon of genomic imprinting has motivated the development of numerous theories for its evolutionary origins and genomic distribution. In this review, we examine the three theories that have best withstood theoretical and empirical scrutiny. These are: Haig and colleagues’ kinship theory; Day and Bonduriansky’s sexual antagonism theory; and Wolf and Hager’s maternal–offspring coadaptation theory. These theories have fundamentally different perspectives on the adaptive significance of imprinting. The kinship theory views imprinting as a mechanism to change gene dosage, with imprinting evolving because of the differential effect that gene dosage has on the fitness of matrilineal and patrilineal relatives. The sexual antagonism and maternal–offspring coadaptation theories view genomic imprinting as a mechanism to modify the resemblance of an individual to its two parents, with imprinting evolving to increase the probability of expressing the fitter of the two alleles at a locus. In an effort to stimulate further empirical work on the topic, we carefully detail the logic and assumptions of all three theories, clarify the specific predictions of each and suggest tests to discriminate between these alternative theories for why particular genes are imprinted

OPUS

Crossref

Washington University St. Louis: Open Scholarship

PubMed Central

Edinburgh Research Explorer

A global reference for human genetic variation

Author: Abecasis GR
Abyzov A
Albers CA
Albrecht MW
Alkan C
Altshuler DM
Amstislavskiy VS
Ananiev V
Antaki D
Antunes L
Asogun D
Auton A
Awadalla P
Ayub Q
Bafna V
Bainbridge M
Balasubramaniam S
Balasubramanian S
Ball EV
Banerjee R
Banks E
Baran Y
Barker J
Barnes B
Barnes KC
Batzer MA
Bauer M
Beal K
Bedoya G
Beiswanger C
Belaia Z
Beloslyudtsev D
Bentley DR
Bhatia G
Blackburne B
Blackwell T
Bodmer W
Boerwinkle E
Borodina TA
Bouk N
Brook LD
Brooks LD
Browning BL
Browning SR
Burchard EG
Burton J
Bustamante CD
Byrnes JK
Cai H
Cai Z
Campbell CL
Cao H
Caron S
Carroll AW
Casale FP
Cerezo M
Cerveira E
Chaisson MJ
Chakravarti A
Challis D
Chang Y
Cheetham RK
Chen C
Chen J
Chen K
Chen T
Chen W
Chen Y
Chew E
Chong Z
Christoforides A
Chu J
Church D
Churchhouse C
Clark AG
Clarke D
Clarke L
Cohen R
Coin L
Coin LJM
Colonna V
Cook C
Cooper DN
Corrah T
Cox A
Craig DW
Cunningham F
Dal E
Daly MJ
Dan X
Danecek P
Datta A
Davies CJ
Dayama G
De La Vega FM
Degenhardt J
DeGorter MK
del Angel G
Del Angel G
Delaneau O
Deng X
DePristo MA
Dermitzakis ET
Desalle R
Devine SE
Ding L
Doddapaneni H
Donnelly P
Duncanson A
Dunham I
Dunn M
Dunstan SJ
Durbin RM
Dutil J
Eberle M
Eichler EE
Emery S
Erlich Y
Evani US
Fan X
Fang L
Fang X
Felsenfeld A
Feng Q
Fitzgerald T
Flicek P
Folarin O
Fonnie R
Fritsche L
Fritz MH
Fu Y
Fuchsberger C
Fulton L
Fulton R
Gabriel SB
Gallo C
Gao Y
Garcia-Montero A
Gardner EJ
Garner J
Garrison EP
Garry R
Genovese G
Gerry NP
Gerstein MB
Gharani N
Gibbs RA
Gignoux CR
Gil L
Gollub J
Goncalo RA
Gottipati S
Grant DS
Gravel S
Green ED
Grocock R
Gujral M
Guo X
Gupta N
Gupta-Hinch A
Gymrek M
Habegger L
Hale W
Hall I
Halperin E
Han Y
Handsaker RE
Happi C
Harmanci AO
Hartl C
Haussler D
Hefferon T
Henn B
Hennis A
Hernandez RD
Herrero J
Herwig R
Hodgkinson A
Homer N
Hormozdiari F
Horn H
Howie B
Huang Z
Huddleston J
Humphray S
Hunt SE
Hurles ME
Hwang J
Hyland FCL
Iqbal Z
Izatt T
Jallow M
James T
Jespersen JB
Jian M
Jiang H
Jin M
Jin X
Jones D
Joof FS
Jorde L
Jostins L
Jun G
Kahn S
Kahveci F
Kalra D
Kang HM
Kanneh L
Kashin S
Katzman SJ
Kaye JS
Keane TM
Keinan A
Kelman G
Kenny EE
Kent A
Kent WJ
Kerasidou A
Khurana E
Kidd JM
Kim D
Kimelman M
Kingsbury Z
Knoppers BM
Koboldt DC
Kolb-Kokocinski A
Kong Y
Konkel MK
Kooner J
Korbel JO
Korchina V
Kovar C
Kretzschmar W
Kulesha E
Kural D
Kurdoglu AA
Kwiatkowski D
Lacroute P
Lage K
Lam H
Lameijer E-W
Lan T
Lander ES
Lappalainen T
LaRocque R
Larson D
Lee C
Lee D
Lee S
Lee W-P
Lehrach H
Leinonen R
Lek M
Leong WF
Li B
Li G
Li H
Li J
Li Q
Li W
Li Y
Li Z
Lienhard M
Lihm J
Lin H
Lindsay SJ
Liu B
Liu C
Liu J
Liu S
Liu X
Lopez J
Louzada S
Lu J
Lu Y
Lunter G
Luo R
Lyons R
Ma X
MacArthur DG
Makarov V
Malhotra A
Malig M
Maples BK
Marchini JL
Marcketta A
Mardis ER
Marth GT
Martin AR
Martinez-Cruzado JC
Massaia A
Mathias R
Mathias RA
Mathieson I
McCarroll SA
McCarthy S
McEwen JE
McKenzie C
McLaren WM
McVean GA
Meiers S
Mendez FL
Menelaou A
Meric P
Mertes F
Michaelson J
Mills RE
Mittelman D
Montgomery SB
Moreno-Estrada A
Moses L
Mu XJ
Murray L
Muzny D
Myers S
Nagaswamy U
Narechania A
Nelson BJ
Nemesh JC
Nguyen TH
Nickerson DA
Ning Z
Noor A
O'Sullivan C
Oleksyk TK
Omoniwa O
Orfao A
Ossorio PN
Ostapchuk Y
Parker M
Parrish NF
Peden J
Peltonen L
Phan L
Plewczynski D
Poletti G
Ponomarov S
Poplin RE
Poznik GD
Qadri F
Quail M
Quitadamo A
Radew K
Radhakrishnan R
Raeder B
Rasheed A
Rausch T
Reid JG
Resch AM
Rimmer A
Ritchie GR
Ritchie GRS
Roa A
Rockett K
Rodriguez-Flores JL
Romanovitch M
Rosenfeld JA
Rotimi CN
Royal CD
Ruiz-Linares A
Sabeti PC
Sabo A
Saleheen D
Sandoval K
Sayres MAW
Schaffner SF
Scheller C
Schieffelin J
Schloss JA
Schmidt JP
Schneider V
Sebat J
Shakir K
Shao H
Shaw R
Shekhtman E
Sherry ST
Shi X
Shlyakhter I
Shringarpure SS
Shriver MD
Sidore C
Simpson JT
Sinari SA
Sirotkin K
Sisu C
Sliwerska E
Slotta D
Smirnov D
Smith RE
Song S
Squire K
Stalker J
Stegle O
Stenson PD
Streeter I
Stremlau M
Stromberg M
Stuetz AM
Su Y
Sudbrak R
Sudmant PH
Sultan M
Swaroop A
Taliun D
Tan A
Tang M
Tariyal R
Thormann A
Tian Z
Timmermann B
Tishkoff S
Toji LH
Toneva I
Tran TH
Tyler-Smith C
Underhill PA
Vaughan B
Vaydylevich Y
Via M
Vitti J
Walker JA
Walter K
Wang B
Wang G
Wang J
Wang Y
Ward AN
Watson H
Webster T
Welch R
Willems TF
Wilson RK
Wing MK
Witherspoon D
Wong B
Wu H
Wu J
Wu R
Xiao C
Xie Y
Xifara DK
Xing J
Xiong M
Xu X
Xue Y
Yang F
Yang H
Yang L
Yaspo M-L
Ye C
Ye K
Yin Y
Yoon SC
Yu C
Yu F
Yu H
Yu J
Zakharia F
Zerbino D
Zhan X
Zhan Y
Zhang C
Zhang D
Zhang F
Zhang H
Zhang J
Zhang M
Zhang W
Zhang Y
Zhao J
Zhao M
Zheng H
Zheng X
Zheng-Bradley X
Zhu H
Zhu J
Zhu Y
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

We thank the many people who were generous with contributing their samples to the project: the African Caribbean in Barbados; Bengali in Bangladesh; British in England and Scotland; Chinese Dai in Xishuangbanna, China; Colombians in Medellin, Colombia; Esan in Nigeria; Finnish in Finland; Gambian in Western Division – Mandinka; Gujarati Indians in Houston, Texas, USA; Han Chinese in Beijing, China; Iberian populations in Spain; Indian Telugu in the UK; Japanese in Tokyo, Japan; Kinh in Ho Chi Minh City, Vietnam; Luhya in Webuye, Kenya; Mende in Sierra Leone; people with African ancestry in the southwest USA; people with Mexican ancestry in Los Angeles, California, USA; Peruvians in Lima, Peru; Puerto Ricans in Puerto Rico; Punjabi in Lahore, Pakistan; southern Han Chinese; Sri Lankan Tamil in the UK; Toscani in Italia; Utah residents (CEPH) with northern and western European ancestry; and Yoruba in Ibadan, Nigeria. Many thanks to the people who contributed to this project: P. Maul, T. Maul, and C. Foster; Z. Chong, X. Fan, W. Zhou, and T. Chen; N. Sengamalay, S. Ott, L. Sadzewicz, J. Liu, and L. Tallon; L. Merson; O. Folarin, D. Asogun, O. Ikpwonmosa, E. Philomena, G. Akpede, S. Okhobgenin, and O. Omoniwa; the staff of the Institute of Lassa Fever Research and Control (ILFRC), Irrua Specialist Teaching Hospital, Irrua, Edo State, Nigeria; A. Schlattl and T. Zichner; S. Lewis, E. Appelbaum, and L. Fulton; A. Yurovsky and I. Padioleau; N. Kaelin and F. Laplace; E. Drury and H. Arbery; A. Naranjo, M. Victoria Parra, and C. Duque; S. Däkel, B. Lenz, and S. Schrinner; S. Bumpstead; and C. Fletcher-Hoppe.

Funding for this work was from the Wellcome Trust Core Award 090532/Z/09/Z and Senior Investigator Award 095552/Z/11/Z (P.D.), and grants WT098051 (R.D.), WT095908 and WT109497 (P.F.), WT086084/Z/08/Z and WT100956/Z/13/Z (G.M.), WT097307 (W.K.), WT0855322/Z/08/Z (R.L.), WT090770/Z/09/Z (D.K.), the Wellcome Trust Major Overseas program in Vietnam grant 089276/Z.09/Z (S.D.), the Medical Research Council UK grant G0801823 (J.L.M.), the UK Biotechnology and Biological Sciences Research Council grants BB/I02593X/1 (G.M.) and BB/I021213/1 (A.R.L.), the British Heart Foundation (C.A.A.), the Monument Trust (J.H.), the European Molecular Biology Laboratory (P.F.), the European Research Council grant 617306 (J.L.M.), the Chinese 863 Program 2012AA02A201, the National Basic Research program of China 973 program no. 2011CB809201, 2011CB809202 and 2011CB809203, Natural Science Foundation of China 31161130357, the Shenzhen Municipal Government of China grant ZYC201105170397A (J.W.), the Canadian Institutes of Health Research Operating grant 136855 and Canada Research Chair (S.G.), Banting Postdoctoral Fellowship from the Canadian Institutes of Health Research (M.K.D.), a Le Fonds de Recherche du Québec-Santé (FRQS) research fellowship (A.H.), Genome Quebec (P.A.), the Ontario Ministry of Research and Innovation – Ontario Institute for Cancer Research Investigator Award (P.A., J.S.), the Quebec Ministry of Economic Development, Innovation, and Exports grant PSR-SIIRI-195 (P.A.), the German Federal Ministry of Education and Research (BMBF) grants 0315428A and 01GS08201 (R.H.), the Max Planck Society (H.L., G.M., R.S.), BMBF-EPITREAT grant 0316190A (R.H., M.L.), the German Research Foundation (Deutsche Forschungsgemeinschaft) Emmy Noether Grant KO4037/1-1 (J.O.K.), the Beatriu de Pinos Program grants 2006 BP-A 10144 and 2009 BP-B 00274 (M.V.), the Spanish National Institute for Health Research grant PRB2 IPT13/0001-ISCIII-SGEFI/FEDER (A.O.), Ewha Womans University (C.L.), the Japan Society for the Promotion of Science Fellowship number PE13075 (N.P.), the Louis Jeantet Foundation (E.T.D.), the Marie Curie Actions Career Integration grant 303772 (C.A.), the Swiss National Science Foundation 31003A_130342 and NCCR “Frontiers in Genetics” (E.T.D.), the University of Geneva (E.T.D., T.L., G.M.), the US National Institutes of Health National Center for Biotechnology Information (S.S.) and grants U54HG3067 (E.S.L.), U54HG3273 and U01HG5211 (R.A.G.), U54HG3079 (R.K.W., E.R.M.), R01HG2898 (S.E.D.), R01HG2385 (E.E.E.), RC2HG5552 and U01HG6513 (G.T.M., G.R.A.), U01HG5214 (A.C.), U01HG5715 (C.D.B.), U01HG5718 (M.G.), U01HG5728 (Y.X.F.), U41HG7635 (R.K.W., E.E.E., P.H.S.), U41HG7497 (C.L., M.A.B., K.C., L.D., E.E.E., M.G., J.O.K., G.T.M., S.A.M., R.E.M., J.L.S., K.Y.), R01HG4960 and R01HG5701 (B.L.B.), R01HG5214 (G.A.), R01HG6855 (S.M.), R01HG7068 (R.E.M.), R01HG7644 (R.D.H.), DP2OD6514 (P.S.), DP5OD9154 (J.K.), R01CA166661 (S.E.D.), R01CA172652 (K.C.), P01GM99568 (S.R.B.), R01GM59290 (L.B.J., M.A.B.), R01GM104390 (L.B.J., M.Y.Y.), T32GM7790 (C.D.B., A.R.M.), P01GM99568 (S.R.B.), R01HL87699 and R01HL104608 (K.C.B.), T32HL94284 (J.L.R.F.), and contracts HHSN268201100040C (A.M.R.) and HHSN272201000025C (P.S.), Harvard Medical School Eleanor and Miles Shore Fellowship (K.L.), Lundbeck Foundation Grant R170-2014-1039 (K.L.), NIJ Grant 2014-DN-BX-K089 (Y.E.), the Mary Beryl Patch Turnbull Scholar Program (K.C.B.), NSF Graduate Research Fellowship DGE-1147470 (G.D.P.), the Simons Foundation SFARI award SF51 (M.W.), and a Sloan Foundation Fellowship (R.D.H.). E.E.E. is an investigator of the Howard Hughes Medical Institute.

Cold Spring Harbor Laboratory Institutional Repository

Bilkent University Institutional Repository

Serveur académique lausannois

Louisiana State University

Carolina Digital Repository

Spiral - Imperial College Digital Repository

Online Research Database In Technology

MPG.PuRe

Brunel University Research Archive

HKU Scholars Hub

University of Queensland eSpace

Crossref