213 research outputs found

Reconstructing complex regions of genomes using long-read sequencing technology

Author: Alkan C.
Antonacci F.
Chaisson M.
Eichler E. E.
Hon L.
Huddleston J.
Malig M.
Ranade S.
Sudmant P. H.
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 09/01/2014

Obtaining high-quality sequence continuity of complex regions of recent segmental duplication remains one of the major challenges of finishing genome assemblies. In the human and mouse genomes, this was achieved by targeting large-insert clones using costly and laborious capillary-based sequencing approaches. Sanger shotgun sequencing of clone inserts, however, has now been largely abandoned, leaving most of these regions unresolved in newer genome assemblies generated primarily by next-generation sequencing hybrid approaches. Here we show that it is possible to resolve regions that are complex in a genome-wide context but simple in isolation for a fraction of the time and cost of traditional methods using long-read single molecule, real-time (SMRT) sequencing and assembly technology from Pacific Biosciences (PacBio). We sequenced and assembled BAC clones corresponding to a 1.3-Mbp complex region of chromosome 17q21.31, demonstrating 99.994% identity to Sanger assemblies of the same clones. We targeted 44 differences using Illumina sequencing and find that PacBio and Sanger assemblies share a comparable number of validated variants, albeit with different sequence context biases. Finally, we targeted a poorly assembled 766-kbp duplicated region of the chimpanzee genome and resolved the structure and organization for a fraction of the cost and time of traditional finishing approaches. Our data suggest a straightforward path for upgrading genomes to a higher quality finished state.

Bilkent University Institutional Repository

Resolving the complexity of the human genome using single-molecule sequencing

Author: Antonacci F.
Boitano M.
Chaisson M. J. P.
Dennis M. Y.
Eichler E. E.
Hormozdiari F.
Huddleston J.
Hunkapiller M. W.
Korlach J.
Landolin J. M.
Malig M.
Sandstrom R.
Stamatoyannopoulos J. A.
Sudmant P. H.
Surti U.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

The human genome is arguably the most complete mammalian reference assembly, yet more than 160 euchromatic gaps remain and aspects of its structural variation remain poorly understood ten years after its completion. To identify missing sequence and genetic variation, here we sequence and analyse a haploid human genome (CHM1) using single-molecule, real-time DNA sequencing. We close or extend 55% of the remaining interstitial gaps in the human GRCh37 reference genome, 78% of which carried long runs of degenerate short tandem repeats, often several kilobases in length, embedded within (G+C)-rich genomic regions. We resolve the complete sequence of 26,079 euchromatic structural variants at the base-pair level, including inversions, complex insertions and long tracts of tandem repeats. Most have not been previously reported, with the greatest increases in sensitivity occurring for events less than 5 kilobases in size. Compared to the human reference, we find a significant insertional bias (3:1) in regions corresponding to complex insertions and long short tandem repeats. Our results suggest a greater complexity of the human genome in the form of variation of longer and more complex repetitive DNA that can now be largely resolved with the application of this longer-read sequencing technology.

Archivio istituzionale della ricerca - Università di Bari

Extensive Copy-Number Variation of Young Genes across Stickleback Populations

Author: A Abyzov
A Alexa
A Conesa
A Hussain
AJ Iafrate
AJ Sharp
AJ Vilella
AR Boyko
AR Quinlan
B Guo
BE Deagle
C Eizaguirre
Christophe Eizaguirre
CL McGrath
CL Peichel
D Bryant
D Juan
D Tautz
DE Cook
DH Huson
DJ Turner
DR Schrider
DR Zerbino
E Gazave
E Proux
Erich Bornberg-Bauer
FA Kondrashov
FC Jones
Frédéric J. J. Chain
G Gibson
G Orti
GC Conant
GH Perry
GM Cooper
H Kehrer-Sawatzki
H Li
Irene E. Samonte
J Sebat
JA Fawcett
Jianzhi Zhang
JJ Emerson
JK Colbourne
JO Korbel
K Chen
K Khalturin
K Ye
KJ Lipinski
KJ Livak
KM Teshima
KM Wegner
L Xu
LC Hsing
LR Saraiva
M Hiraiwa
M Long
M Lynch
M Milinski
M Roesti
MA DePristo
Mahesh Panchal
Manfred Milinski
Martin Kalbe
Monika Stoll
N Ghanem
P Danecek
P Flicek
P Sjödin
PA Hohenlohe
PGD Feulner
PH Sudmant
Philine G. D. Feulner
PM Kim
R Redon
RC Iskow
S Moretti
S Sawyer
SF Altschul
SH Williamson
SM Waszak
SR Browning
T Marques-Bonet
T Rausch
TD Schmittgen
Thorsten B. H. Reusch
Tobias L. Lenz
V Guryev
V Katju
V Ranwez
X Huang
Y Hashiguchi
Y Zheng
YE Zhang
YF Chan
Z Yang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

MM received funding from the Max Planck innovation funds for this project. PGDF was supported by a Marie Curie European Reintegration Grant (proposal nr 270891). CE was supported by German Science Foundation grants (DFG, EI 841/4-1 and EI 841/6-1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

OceanRep

Crossref

Directory of Open Access Journals

PubMed Central

Queen Mary Research Online

Bern Open Repository and Information System (BORIS)

MPG.PuRe

FigShare

The birth of a human-specific neural gene by incomplete duplication and gene fusion

Author: A Fortna
AJ Sharp
BE Davy
BJ O’Roak
BP Coe
Bradley J. Nelson
C Charrier
Carl Baker
D Reich
EA Boyle
Evan E. Eichler
F Antonacci
F Cunningham
F Hach
Francesca Antonacci
GTEx Consortium
GV Glazko
H Olbrich
HC Mefford
HR Dawe
I Lazaridis
J Felsenstein
J Harrow
J Huddleston
J Jun
J Prado-Martinez
JA Bailey
JB Hiatt
JD Parsons
JD Thompson
John Huddleston
JP Bielawski
JR Lupski
K Prüfer
K Tamura
K Vandepoele
K-F Lechtreck
Lana Harshman
M Florio
M Kimura
M Kozak
M Lynch
M Meyer
M Nei
M O’Bleness
Mario Ventura
Max L. Dougherty
MC Popesco
Megan Y. Dennis
Michael H. Duyzend
MY Dennis
N Brunetti-Pierri
N Saitou
NA Doggett
NL Bray
Osnat Penn
P Stankiewicz
PH Sudmant
Q Fu
R Bernier
RE Thurman
Richard Sandstrom
S Chen
S Girirajan
S John
S Ohno
T Kurosaki
T Marques-Bonet
T Ota
WJ Kent
X Nuttle
Xander Nuttle
Z Jiang
Z Yang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Background: Gene innovation by duplication is a fundamental evolutionary process but is difficult to study in humans due to the large size, high sequence identity, and mosaic nature of segmental duplication blocks. The human-specific gene hydrocephalus-inducing 2, HYDIN2, was generated by a 364 kbp duplication of 79 internal exons of the large ciliary gene HYDIN from chromosome 16q22.2 to chromosome 1q21.1. Because the HYDIN2 locus lacks the ancestral promoter and seven terminal exons of the progenitor gene, we sought to characterize transcription at this locus by coupling reverse transcription polymerase chain reaction and long-read sequencing. Results: 5' RACE indicates a transcription start site for HYDIN2 outside of the duplication and we observe fusion transcripts spanning both the 5' and 3' breakpoints. We observe extensive splicing diversity leading to the formation of altered open reading frames (ORFs) that appear to be under relaxed selection. We show that HYDIN2 adopted a new promoter that drives an altered pattern of expression, with highest levels in neural tissues. We estimate that the HYDIN duplication occurred ~3.2 million years ago and find that it is nearly fixed (99.9%) for diploid copy number in contemporary humans. Examination of 73 chromosome 1q21 rearrangement patients reveals that HYDIN2 is deleted or duplicated in most cases. Conclusions: Together, these data support a model of rapid gene innovation by fusion of incomplete segmental duplications, altered tissue expression, and potential subfunctionalization or neofunctionalization of HYDIN2 early in the evolution of the Homo lineage.

Crossref

Springer - Publisher Connector

Archivio istituzionale della ricerca - Università di Bari

PubMed Central

eScholarship - University of California

KoVariome: Korean National Standard Reference Variome database of whole genomes with comprehensive SNV, indel, CNV, and SV analyses

Author: A McKenna
A Scally
A Telenti
AA Alshatwi
B Charlesworth
C Dong
C Genomes Project
C Loveday
D Hong
D Lakich
D Pinto
D Welter
DE Reich
DG MacArthur
DI Boomsma
EM Shore
FS Collins
GH Perry
H Li
H Stefansson
HP-AS Consortium
J Huddleston
J Jakobsson
J Wang
JI Kim
JR MacDonald
K Chen
K Ye
L Feuk
LP Wong
LT Chen
M Lek
M Nagasaki
MC Hunt
MJ Bamshad
MJ Landrum
ML Bondeson
P Cingolani
P Kraft
PH Sudmant
R Ihaka
R Redon
RE Mills
S Besenbacher
S Lee
S Malik
S Purcell
S Tunaru
SA McCarroll
SH Kwak
SM Ahn
ST Sherry
T Mimori
TL Yang
V Boeva
W Zhang
X Wang
YS Cho
YS Ju
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/04/2018

High-coverage whole-genome sequencing data of a single ethnicity can provide a useful catalogue of population-specific genetic variations and a critical resource that can be used to more accurately identify pathogenic genetic variants. We report a comprehensive analysis of the Korean population, and present the Korean National Standard Reference Variome (KoVariome). As a part of the Korean Personal Genome Project (KPGP), we constructed the KoVariome database using 5.5 terabases of whole genome sequence data from 50 healthy Korean individuals in order to characterize the benign ethnicity-relevant genetic variation present in the Korean population. In total, KoVariome includes 12.7M single-nucleotide variants (SNVs), 1.7M short insertions and deletions (indels), 4K structural variations (SVs), and 3.6K copy number variations (CNVs). Among them, 2.4M (19%) SNVs and 0.4M (24%) indels were identified as novel. We also discovered selective enrichment of 3.8M SNVs and 0.5M indels in Korean individuals, which were used to filter out 1,271 coding-SNVs not originally removed from the 1,000 Genomes Project when prioritizing disease-causing variants. KoVariome health records were used to identify novel disease-causing variants in the Korean population, demonstrating the value of high-quality ethnic variation databases for the accurate interpretation of individual genomes and the precise characterization of genetic variation.

HANYANG Repository

Crossref

Directory of Open Access Journals

ScholarWorks@UNIST

Diversity of human copy number variation and multicopy genes

Author: Abecasis G. R.
Alkan Can
Altshuler D. L.
Antonacci Francesca
Bentley D. R.
Bruhn Laurakay
Chakravarti A.
Clark A. G.
Collins F. S.
De La Vega F. M.
Donnelly P.
Durbin R. M.
Egholm M.
Eichler Evan E.
Flicek P.
Gabriel S. B.
Gibbs R. A.
Kitzman Jacob O.
Knoppers B. M.
Lander E. S.
Lehrach H.
Malig Maika
Mardis E. R.
McVean G. A.
Nickerson D. A.
Peltonen L.
Sampas Nick
Schafer A. J.
Shendure Jay
Sherry S. T.
Sudmant Peter H.
Tsalenko Anya
Wang J.
Wilson R. K.
Publication venue: LSU Digital Commons
Publication date: 29/10/2010

Copy number variants affect both disease and normal phenotypic variation, but those lying within heavily duplicated, highly identical sequence have been difficult to assay. By analyzing short-read mapping depth for 159 human genomes, we demonstrated accurate estimation of absolute copy number for duplications as small as 1.9 kilobase pairs, ranging from 0 to 48 copies. We identified 4.1 million singly unique nucleotide positions informative in distinguishing specific copies and used them to genotype the copy and content of specific paralogs within highly duplicated gene families. These data identify human-specific expansions in genes associated with brain development, reveal extensive population genetic diversity, and detect signatures consistent with gene conversion in the human species. Our approach makes ∼1000 genes accessible to genetic studies of disease association.

Louisiana State University

Corrigendum: An ethnically relevant consensus Korean reference genome is a step towards personal reference genomes.

Author: A McKenna
AC English
AL Price
B Zhang
C Alkan
C Camacho
C Soderlund
D Earl
D Muddyman
D Reich
DM Church
ES Lander
FE Dewey
G Abrusán
G Benson
G Tosato
GM Church
GS Slater
H Bai
H Cao
H Li
J Huddleston
J Jurka
J Wang
JA Bedell
JA Rosenfeld
JR MacDonald
JT Simpson
K Howe
K Prüfer
KD Pruitt
KM Steinberg
L Fan
L Shi
M Pendleton
M Stanke
MJ Chaisson
MJ Landrum
P Cingolani
P Kersbergen
PH Sudmant
R Li
R Luo
RC McCoy
RE Green
RE Mills
S Gnerre
S Koren
S Levy
S Purcell
S Schiffels
S Sheehan
ST Sherry
W Zhang
WJ Kent
Y Choi
Y Dong
Y Li
Z Jiang
Publication venue: Nat Commun
Publication date: 24/11/2016

This corrects the article DOI: 10.1038/ncomms13637

The complete genome sequence of a Neandertal from the Altai Mountains

We present a high-quality genome sequence of a Neandertal woman from Siberia. We show that her parents were related at the level of half siblings and that mating among close relatives was common among her recent ancestors. We also sequenced the genome of a Neandertal from the Caucasus to low coverage. An analysis of the relationships and population history of available archaic genomes and 25 present-day human genomes shows that several gene flow events occurred among Neandertals, Denisovans and early modern humans, possibly including gene flow into Denisovans from an unknown archaic group. Thus, interbreeding, albeit of low magnitude, occurred among many hominin groups in the Late Pleistocene. In addition, the high quality Neandertal genome allows us to establish a definitive list of substitutions that became fixed in modern humans after their separation from the ancestors of Neandertals and Denisovans.

Harvard University - DASH

PubMed Central

eScholarship - University of California

MPG.PuRe

Copy number variation arising from gene conversion on the human Y chromosome

Author: A Massaia
Anders Bergström
Andrea Massaia
B Trombetta
BL Dumont
Chris Tyler-Smith
CM Carvalho
DJ Turner
E Bosch
Fengtang Yang
GX Zheng
H Skaletsky
J Wang
JM Chen
JR Lupski
JW Szostak
MA Jobling
ME Hurles
Michael A. Quail
P Hallast
PH Sudmant
Pille Hallast
PJ Hastings
Qasim Ayub
Ruby Banerjee
S Repping
S Rozen
Sandra Louzada
Steven Leonard
The GTEx Consortium
Wentao Shi
Yali Xue
Yong Gu
Yuan Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date

Crossref