
377 research outputs found

The Diploid Genome Sequence of an Individual Human

Author: Abril Josep F
Axelrod Nelson
Bafna Vineet
Bansal Vikas
Beeson Karen Y
Borman Jon
Busam Dana A
Denisov Gennady
Feuk Lars
Frazier Marvin E
Gill John
Halpern Aaron L
Huang Jiaqi
Kirkness Ewen F
Kravitz Saul A
Levy Samuel
Lin Yuan
MacDonald Jeffrey R
McIntosh Tina C
Ng Pauline C
Pang Andy Wing Chun
Remington Karin A
Rogers Yu-Hui
Scherer Stephen W
Shago Mary
Stockwell Timothy B
Strausberg Robert L
Sutton Granger
Tsiamouri Alexia
Venter J. Craig
Walenz Brian P
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2007

Presented here is a genome sequence of an individual human. It was produced from ~32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
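The 22% share quoted in the abstract above can be checked from the event counts it lists. The following is a minimal sketch, assuming the five counted categories (SNPs, block substitutions, heterozygous indels, homozygous indels, inversions) make up the full event total; segmental duplications and copy number variation regions are left out because the abstract gives no counts for them. The names `counts` and `non_snp_fraction` are illustrative only.

```python
# Event counts as quoted in the abstract; the dictionary keys are illustrative.
counts = {
    "snp": 3_213_401,
    "block_substitution": 53_823,
    "heterozygous_indel": 292_102,
    "homozygous_indel": 559_473,
    "inversion": 90,
}


def non_snp_fraction(event_counts):
    """Return the share of events that are not SNPs."""
    total = sum(event_counts.values())
    non_snp = total - event_counts["snp"]
    return non_snp / total


total_events = sum(counts.values())
fraction = non_snp_fraction(counts)

print(f"total events: {total_events:,}")
print(f"non-SNP share of events: {fraction:.1%}")
```

Under that assumption the total is a little over 4.1 million events and the non-SNP share comes out at roughly 22%, in line with the figures in the abstract. The 74% share of variant bases cannot be reproduced this way, because the abstract does not give per-category base counts.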

Public Library of Science (PLOS)

Diposit Digital de la Universitat de Barcelona

ScholarBank@NUS

ESTIMATING GENOME-WIDE COPY NUMBER USING ALLELE SPECIFIC MIXTURE MODELS

Author: A. Iafrate
A.J. Sharp
C. Li
D. Komura
D.A. Hinds
D.A. Peiffer
D.F. Conrad
D.M. Rocke
E. Tuzun
F.S. Collins
G.C. Kennedy
G.R. Bignell
J. Huang
J. Sebat
L. Feuk
N. Rabbee
R. Irizarry
S. Ishikawa
S.A. McCarroll
X. Zhao
Y. Nannya
Publication venue: Collection of Biostatistics Research Archive
Publication date: 25/10/2006

Genomic changes such as copy number alterations are thought to be one of the major underlying causes of human phenotypic variation among normal and disease subjects [23,11,25,26,5,4,7,18]. These include chromosomal regions with so-called copy number alterations: instead of the expected two copies, a section of the chromosome for a particular individual may have zero copies (homozygous deletion), one copy (hemizygous deletion), or more than two copies (amplification). The canonical example is Down syndrome, which is caused by an extra copy of chromosome 21. Identification of such abnormalities in smaller regions has been of great interest, because they are believed to be an underlying cause of cancer. More than a decade ago, comparative genomic hybridization (CGH) technology was developed to detect copy number changes in a high-throughput fashion. However, this technology only provides a 10 Mb resolution, which limits the ability to detect copy number alterations spanning small regions. It is widely believed that a copy number alteration as small as one base can have significant downstream effects; thus, microarray manufacturers have developed technologies that provide much higher resolution. Unfortunately, strong probe effects and variation introduced by sample preparation procedures have made single-point copy number estimates too imprecise to be useful. CGH arrays use a two-color hybridization, usually comparing a sample of interest to a reference sample, which to some degree removes the probe effect. However, the resolution is not nearly high enough to provide single-point copy number estimates. Various groups have proposed statistical procedures that pool data from neighboring locations to successfully improve precision. However, these procedures need to average across relatively large regions to work effectively, thus greatly reducing the resolution.
Recently, regression-type models that account for the probe effect have been proposed and appear to improve accuracy as well as precision. In this paper, we propose a mixture model solution specifically designed for single-point estimation that provides various advantages over the existing methodology. We use a 314-sample database, constructed from public datasets, to motivate and fit models for the conditional distribution of the observed intensities given allele-specific copy numbers. With the estimated models in place, we can compute posterior probabilities that provide a useful prediction rule as well as a confidence measure for each call. Software to implement this procedure will be available in the Bioconductor oligo package (http://www.bioconductor.org).
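The idea of turning fitted conditional distributions into posterior probabilities can be illustrated with Bayes' rule. This is a simplified sketch and not the authors' model: it assumes a single log2 intensity per locus, one Gaussian component per total copy number, and made-up means, standard deviations and priors. The abstract's model is allele-specific and its parameters are fitted from data. All names here (`COMPONENTS`, `PRIORS`, `posterior`, `call`) are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Hypothetical (mean, sd) of log2 intensity for each copy number state.
COMPONENTS = {
    0: (7.0, 0.40),
    1: (8.5, 0.30),
    2: (9.5, 0.30),
    3: (10.1, 0.30),
}

# Hypothetical prior probability of each copy number state.
PRIORS = {0: 0.01, 1: 0.04, 2: 0.90, 3: 0.05}


def posterior(intensity):
    """Posterior probability of each copy number given one intensity."""
    joint = {
        cn: PRIORS[cn] * normal_pdf(intensity, mean, sd)
        for cn, (mean, sd) in COMPONENTS.items()
    }
    total = sum(joint.values())
    return {cn: value / total for cn, value in joint.items()}


def call(intensity):
    """Return the most probable copy number and its posterior probability."""
    post = posterior(intensity)
    best = max(post, key=post.get)
    return best, post[best]


copy_number, confidence = call(8.5)
print(copy_number, round(confidence, 3))
```

The posterior of the winning state doubles as the confidence measure for the call: values near 1 indicate an unambiguous intensity, while values near 0.5 flag loci that fall between two components.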

Crossref

Collection Of Biostatistics Research Archive

On the power and the systematic biases of the detection of chromosomal inversions by paired-end genome sequencing

Author: A Bashir
AA Hoffmann
AJ Iafrate
AM Hillmer
AW Pang
B Zeitouni
C Alkan
CB Krimbas
DC Richter
E Tuzun
F Hormozdiari
H Li
H Stefansson
J Cao
J Sebat
J Wang
JC Roach
JM Kidd
JO Korbel
José Ignacio Lucas Lledó
K Chen
KF Manly
KJ McKernan
L Feuk
M Onishi-Seebacher
Mario Cáceres
P Medvedev
PJ Campbell
PJ Stephens
R Xi
S Suzuki
SM Ahn
SS Sindi
T Rausch
Y Jiang
ZD Zhang
Zhanjiang Liu
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2013

One of the most widely used techniques to study structural variation at the genome level is paired-end mapping (PEM). PEM has the advantage of being able to detect balanced events, such as inversions and translocations. However, inversions are still quite difficult to predict reliably, especially from high-throughput sequencing data. We simulated realistic PEM experiments with different combinations of read and library fragment lengths, including sequencing errors and meaningful base qualities, to quantify and track down the origin of false positives and negatives along sequencing, mapping, and downstream analysis. We show that PEM is well suited to detecting a wide range of inversions, even with low-coverage data. However, % of inversions located between segmental duplications are expected to go undetected by the most common sequencing strategies. In general, longer DNA libraries improve the detectability of inversions far better than increases in coverage depth or read length. Finally, we review the performance of three algorithms to detect inversions (SVDetect, GRIAL, and VariationHunter), identify common pitfalls, and reveal important differences in their breakpoint precisions. These results stress the importance of the sequencing strategy for the detection of structural variants, especially inversions, and offer guidelines for the design of future genome sequencing projects.
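The signal that PEM relies on for inversions can be sketched as a read-pair classifier. This is a toy illustration, not the logic of SVDetect, GRIAL or VariationHunter: it assumes a standard paired-end library in which concordant pairs map to opposite strands and face each other, and it treats pairs that map to the same strand as inversion candidates. The `ReadPair` type, the `classify` function and the insert-size thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    chrom1: str
    pos1: int
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str  # "+" or "-"


def classify(pair, min_insert=200, max_insert=600):
    """Label a mapped read pair by orientation and mapped distance.

    The thresholds are placeholder values; a real analysis would derive
    them from the observed insert-size distribution of the library.
    """
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"
    if pair.strand1 == pair.strand2:
        # Both reads on the same strand: the pair may span an inversion breakpoint.
        return "inversion_candidate"
    ends = sorted([(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)])
    (left_pos, left_strand), (right_pos, _) = ends
    if left_strand == "-":
        # Reads point away from each other instead of facing inward.
        return "everted"
    span = right_pos - left_pos
    if span > max_insert:
        return "deletion_candidate"
    if span < min_insert:
        return "insertion_candidate"
    return "concordant"


pairs = [
    ReadPair("chr1", 1000, "+", "chr1", 1400, "-"),
    ReadPair("chr1", 5000, "+", "chr1", 9000, "+"),
]
for pair in pairs:
    print(classify(pair))
```

A real caller would then cluster candidate pairs that support the same breakpoints and filter on mapping quality. The abstract's point about segmental duplications applies at exactly this stage, since reads falling in duplicated sequence cannot be placed uniquely.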

Public Library of Science (PLOS)

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori d'Objectes Digitals per a l'Ensenyament la Recerca i la Cultura

Directory of Open Access Journals

PubMed Central

Diposit Digital de Documents de la UAB

Rare copy number variation in cerebral palsy

Author: A MacLennan
A Moreno-De-Luca
A Raj
AJ Verkerk
Alastair MacLennan
Andres Moreno-De-Luca
B Petterson
BJ O'Roak
C Harvard
Catherine Gibson
CG de Kovel
Chloe Shard
Christa Lese Martin
CN Lynex
D Pinto
DP McHale
EM Strijbis
Eric Haan
Evan Eichler
G Cooper
Gai McMichael
HC Mefford
HM Ozgen
J Gardosi
J Molina
JA Veltman
JH Chai
Jillian Nicholl
Jozef Gecz
JR Lupski
KB Nelson
L Feuk
L Potocki
Lam Son Nguyen
M Aza-Carmona
M Garshasbi
M Medina
M Seeger
M Trimborn
ME O'Callaghan
N Badawi
N Paneth
NJ Wild
P Bauer
P Rosenbaum
PD Evans
R Abou Jamra
R Carrozzo
R Palisano
RR Selzer
S Girirajan
Santhosh Girirajan
T Vrijenhoek
W Henke
X Gai
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

As per publisher: published online 22 May 2013
Recent studies have established the role of rare copy number variants (CNVs) in several neurological disorders, but the contribution of rare CNVs to cerebral palsy (CP) is not known. Fifty Caucasian families having children with CP were studied using two microarray designs. Potentially pathogenic, rare (<1% population frequency) CNVs were identified, and their frequency determined, by comparing the CNVs found in cases with 8329 adult controls with no known neurological disorders. Ten of the 50 cases (20%) had rare CNVs of potential relevance to CP; there were a total of 14 CNVs, which were observed in <0.1% (<8/8329) of the control population. Eight were inherited from an unaffected mother: a 751-kb deletion including FSCB, a 1.5-Mb duplication of 7q21.13, a 534-kb duplication of 15q11.2, a 446-kb duplication including CTNND2, a 219-kb duplication including MCPH1, a 169-kb duplication of 22q13.33, a 64-kb duplication of MC2R, and a 135-bp exonic deletion of SLC06A1. Three were inherited from an unaffected father: a 386-kb deletion of 12p12.2-p12.1, a 234-kb duplication of 10q26.13, and a 4-kb exonic deletion of COPS3. The inheritance was unknown for three CNVs: a 157-bp exonic deletion of ACOX1, a 693-kb duplication of 17q25.3, and a 265-kb duplication of DAAM1. This is the first systematic study of CNVs in CP and, although it did not identify de novo mutations, it has shown inherited, rare CNVs involving potentially pathogenic genes and pathways requiring further investigation.
Gai McMichael, Santhosh Girirajan, Andres Moreno-De-Luca, Jozef Gecz, Chloe Shard, Lam Son Nguyen, Jillian Nicholl, Catherine Gibson, Eric Haan, Evan Eichler, Christa Lese Martin and Alastair MacLennan

Crossref

Adelaide Research & Scholarship

KoVariome: Korean National Standard Reference Variome database of whole genomes with comprehensive SNV, indel, CNV, and SV analyses

Author: A McKenna
A Scally
A Telenti
AA Alshatwi
B Charlesworth
C Dong
C Genomes Project
C Loveday
D Hong
D Lakich
D Pinto
D Welter
DE Reich
DG MacArthur
DI Boomsma
EM Shore
FS Collins
GH Perry
H Li
H Stefansson
HP-AS Consortium
J Huddleston
J Jakobsson
J Wang
JI Kim
JR MacDonald
K Chen
K Ye
L Feuk
LP Wong
LT Chen
M Lek
M Nagasaki
MC Hunt
MJ Bamshad
MJ Landrum
ML Bondeson
P Cingolani
P Kraft
PH Sudmant
R Ihaka
R Redon
RE Mills
S Besenbacher
S Lee
S Malik
S Purcell
S Tunaru
SA McCarroll
SH Kwak
SM Ahn
ST Sherry
T Mimori
TL Yang
V Boeva
W Zhang
X Wang
YS Cho
YS Ju
Publication venue: Springer Science and Business Media LLC
Publication date: 01/04/2018

High-coverage whole-genome sequencing data of a single ethnicity can provide a useful catalogue of population-specific genetic variations and a critical resource for more accurately identifying pathogenic genetic variants. We report a comprehensive analysis of the Korean population, and present the Korean National Standard Reference Variome (KoVariome). As a part of the Korean Personal Genome Project (KPGP), we constructed the KoVariome database using 5.5 terabases of whole genome sequence data from 50 healthy Korean individuals in order to characterize the benign ethnicity-relevant genetic variation present in the Korean population. In total, KoVariome includes 12.7M single-nucleotide variants (SNVs), 1.7M short insertions and deletions (indels), 4K structural variations (SVs), and 3.6K copy number variations (CNVs). Among them, 2.4M (19%) SNVs and 0.4M (24%) indels were identified as novel. We also discovered selective enrichment of 3.8M SNVs and 0.5M indels in Korean individuals, which were used to filter out 1,271 coding SNVs not originally removed from the 1000 Genomes Project when prioritizing disease-causing variants. KoVariome health records were used to identify novel disease-causing variants in the Korean population, demonstrating the value of high-quality ethnic variation databases for the accurate interpretation of individual genomes and the precise characterization of genetic variation.

HANYANG Repository

Crossref

Directory of Open Access Journals

ScholarWorks@UNIST

"GenotypeColour™": colour visualisation of SNPs and CNVs

Author: BA Weir
Chiara Magri
JC Ting
L Feuk
M Lin
Sergio Barlati
Sergio Chiesa
Y Nannya
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The volume of data available on genetic variations has increased considerably with the recent development of high-density, single-nucleotide polymorphism (SNP) arrays. Several software programs have been developed to assist researchers in the analysis of this huge amount of data, but few can rely upon a whole-genome variability visualisation system that could help data interpretation. Results: We have developed GenotypeColour™ as a rapid, user-friendly tool able to upload, visualise and compare the huge amounts of data produced by Affymetrix Human Mapping GeneChips without losing the overall view of the data. Some features of GenotypeColour™ include visualising the entire genome variability in a single screenshot for one or more samples, the simultaneous display of the genotype and copy number state for thousands of SNPs, and the comparison of large numbers of samples by producing "consensus" images displaying regions of complete or partial identity. The software is also useful for genotype analysis of trios and to show regions of potential uniparental disomy (UPD). All information can then be exported in a tabular format for analysis with dedicated software. At present, the software can handle data from 10 K, 100 K, 250 K, 5.0 and 6.0 Affymetrix chips. Conclusion: We have created software that offers a new way of displaying and comparing SNP and CNV genomic data. The software is available free at http://www.med.unibs.it/~barlati/GenotypeColour and is especially useful for the analysis of multiple samples.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Archivio istituzionale della ricerca - Università di Brescia

Correction: Exome Sequencing in an Admixed Isolated Population Indicates NFXL1 Variants Confer a Risk for Specific Language Impairment

Author: A Clark
A Gialluisi
A Kong
AC Cummings
AJ Whitehouse
Alexander Hoischen
Anne O’Hare
B Bakkaloglu
B Peter
B St Pourcain
BJ O’Roak
Brett S. Abrahams
C Bourgain
C Gilissen
C Mussig
C Zweier
CA Dollaghan
Christian Gilissen
Clyde Francks
CS Lai
CS Leblond
D Horn
D Nyholt
D Wechsler
DF Newbury
Dianne F. Newbury
DV Bishop
E De Renzi
E Spiteri
Elizabeth R. Hennessy
EM Semel
F Ceroni
G Conti-Ramsden
G Lunter
Gillian Baird
Gina Conti-Ramsden
GR Abecasis
GT Marth
HC Whalley
Hernán Palomino
I Mathieson
J Law
JD Eicher
Jean-Baptiste Cazier
JM Schwarz
Joris A. Veltman
KD MacDermot
L Feuk
LD Shriberg
Lillian Jara
LM Bedore
Luis Carvajal-Carmona
M Falcaro
M Kamal
M Kos
M Luciano
M Pavez
M Xu
MA Rivas
Maria Magdalena Echeverry
María Angélica Fernández
MB Stein
MM Pavez
NH Simpson
Nuala H. Simpson
P Cingolani
P Rodenas-Cuadrado
P Tallal
P Villanueva
Patrick F. Bolton
Pía Villanueva
R Chaerkady
R Nudel
RM Durbin
Ron Nudel
Rose H. Reader
S Girirajan
S Harel
S Purcell
S Rozen
S Zeesman
SC Vernes
SE Fisher
Simon E. Fisher
T Thornton
TS Scerri
V Burden
W Shu
W Tang
Z Song
Zulema De Barbieri
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2015

Children affected by Specific Language Impairment (SLI) fail to acquire age-appropriate language skills despite adequate intelligence and opportunity. SLI is highly heritable, but the understanding of underlying genetic mechanisms has proved challenging. In this study, we use molecular genetic techniques to investigate an admixed isolated founder population from Robinson Crusoe Island (Chile), who are affected by a high incidence of SLI, increasing the power to discover contributory genetic factors. We utilize exome sequencing in selected individuals from this population to identify eight coding variants that are of putative significance. We then apply association analyses across the wider population to highlight a single rare coding variant (rs144169475, Minor Allele Frequency of 4.1% in admixed South American populations) in the NFXL1 gene that confers a nonsynonymous change (N150K) and is significantly associated with language impairment in the Robinson Crusoe population (p = 2.04 × 10^-4, 8 variants tested). Subsequent sequencing of NFXL1 in 117 UK SLI cases identified four individuals with heterozygous variants predicted to be of functional consequence. We conclude that coding variants within NFXL1 confer an increased risk of SLI within a complex genetic model.

Aberdeen University Research

CLoK

Crossref

University of Birmingham Research Portal

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Radboud Repository (Radboud Univ.)

Repositorio Académico de la Universidad de Chile

MPG.PuRe

The Francis Crick Institute

An enhanced method for targeted next generation sequencing copy number variant detection using ExomeDepth [version 1; peer review: 1 approved, 1 approved with reservations]

Author: C Watson
J Schouten
L Feuk
M Zarrei
M Zhao
S Teo
V Plagnol
Publication venue: F1000 Research Ltd
Publication date: 14/07/2017

Copy number variants (CNVs) are a major cause of disease, with over 30,000 reported in the DECIPHER database. To use read depth data from targeted Next Generation Sequencing (NGS) panels to identify CNVs with the highest degree of sensitivity, it is necessary to account for biases inherent in the data. GC content and ambiguous mapping due to repetitive sequence elements and pseudogenes are the principal components of technical variability. In addition, the algorithms used favour the detection of multi-exon CNVs, and rely on suitably matched normal dosage samples for comparison. We developed a calling strategy that subdivides target intervals and uses pools of historical control samples to overcome these limitations in a clinical diagnostic laboratory. We compared our enhanced strategy with an unmodified pipeline using the R software package ExomeDepth, using a cohort of 109 heterozygous CNVs (91 deletions, 18 duplications in 26 genes), including 25 single-exon CNVs. The unmodified pipeline detected 104/109 CNVs, giving a sensitivity with a 95% confidence interval of 89.62% to 98.49%. The detection of all 109 CNVs by our enhanced method demonstrates with 95% confidence that the sensitivity is ≥96.67%, allowing NGS read depth analysis to be used for CNV detection in a clinical diagnostic setting.
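The sensitivity bounds quoted above are consistent with an exact binomial (Clopper-Pearson) interval, although the abstract does not name the method, so that choice is an assumption here. The sketch below uses only the standard library and finds each bound by bisection on the binomial tail probability; the function names are illustrative.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(f, target, iterations=60):
    """Find p in [0, 1] with f(p) = target, for f increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Two-sided exact confidence interval for k successes in n trials."""
    if k == 0:
        lower = 0.0
    else:
        # Lower bound: P(X >= k | p) = alpha / 2.
        lower = _solve_increasing(lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    if k == n:
        upper = 1.0
    else:
        # Upper bound: P(X <= k | p) = alpha / 2, i.e. P(X > k | p) = 1 - alpha / 2.
        upper = _solve_increasing(lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


for detected in (104, 109):
    low, high = clopper_pearson(detected, 109)
    print(f"{detected}/109: {low:.2%} to {high:.2%}")
```

When every CNV is detected, the lower bound has the closed form (alpha/2)^(1/n). For n = 109 this gives about 96.67%, matching the figure in the abstract.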

RD&E Research Repository

Crossref

Royal Devon and Exeter Research Repository

White Rose Research Online

Genome-Wide Association Study of Copy Number Variants Suggests LTBP1 and FGD4 Are Important for Alcohol Drinking

Author: A Berney
A Gualandris
AC Grobin
AC Heath
AJ Ridley
AL Price
B Xu
BE Stranger
BS Saltzman
CA Prescott
CN Henrichsen
DM Dick
DQ Nguyen
E Gonzalez
G Merla
G Schumann
HJ Edenberg
Hong-Wen Deng
Hui Shen
IS Consortium
J Gelernter
J Sebat
JB Whitfield
JD Grant
Jian Li
JL Freeman
JM Korn
JR Lupski
JT Glessner
KA Fisher
KS Kendler
L Ciuclan
L Feuk
L Zuo
Lei Zhang
M Economou
M Kneussel
P Cahan
P Ibanez
PR Buckland
Qing Tian
R Redon
Rong Hai
S D'Alfonso
S Repping
SA McCarroll
Shu Ran
SL Liu
T Foroud
T Walsh
Tie-Lin Yang
Weili Zhang
X Luo
X Zhu
Xingguang Luo
Xue-Zhen Zhu
Yingying Han
Yu-Fang Pei
Publication venue: Public Library of Science
Publication date: 01/01/2012

Alcohol dependence (AD) is a complex disorder characterized by psychiatric and physiological dependence on alcohol. AD is reflected by regular alcohol drinking, which is highly heritable. In this study, to identify susceptibility genes associated with alcohol drinking, we performed a genome-wide association study of copy number variants (CNVs) in 2,286 Caucasian subjects using the Affymetrix SNP6.0 genotyping array. We replicated our findings in 1,627 Chinese subjects with the same genotyping array. We identified two CNVs, CNV207 (combined p-value 1.91E-03) and CNV1836 (combined p-value 3.05E-03), that were associated with alcohol drinking. CNV207 and CNV1836 are located downstream of the genes LTBP1 (870 kb) and FGD4 (400 kb), respectively. LTBP1, by interacting with TGFB1, may down-regulate enzymes directly participating in alcohol metabolism. FGD4 plays a role in clustering and trafficking GABAA receptors and subsequently influences alcohol drinking through activating CDC42. Our results provide suggestive evidence that the newly identified CNV regions and relevant genes may contribute to the genetic mechanism of alcohol dependence.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS

The Francis Crick Institute

An examination of the Apo-1/Fas promoter Mva I polymorphism in Japanese patients with multiple sclerosis

Author: BG Weinshenker
DH Lynch
F Leithauser
GC Ebers
Ichiro Yabe
J Inazawa
JL Haines
Kunio Tashiro
L Feuk
M Leverkus
M Schmied
Masaaki Niino
MP Pender
P Lichter
QR Huang
Ryuji Miyagishi
Seiji Kikuchi
T Fukazawa
T Suda
Toshiyuki Fukazawa
WI McDonald
YH Lee
Publication venue: BioMed Central
Publication date: 01/01/2002

BACKGROUND: The Apo-1/Fas (CD95) molecule is an apoptosis-signaling cell surface receptor belonging to the tumor necrosis factor (TNF) receptor family. Both Fas and Fas ligand (FasL) are expressed in activated mature T cells, and prolonged cell activation induces susceptibility to Fas-mediated apoptosis. The Apo-1/Fas gene is located in a chromosomal region that shows linkage in multiple sclerosis (MS) genome screens, and studies indicate that there is aberrant expression of the Apo-1/Fas molecule in MS. METHODS: The Mva I polymorphism on the Apo-1/Fas promoter gene was detected by PCR-RFLP from the DNA of 114 Japanese patients with conventional MS and 121 healthy controls. We investigated the association of the Mva I polymorphism in Japanese MS patients using a case-control association study design. RESULTS: We found no evidence that the polymorphism contributes to susceptibility to MS. Furthermore, there was no association between Apo-1/Fas gene polymorphisms and clinical course (relapsing-remitting course or secondary-progressive course). No significant association was observed between Apo-1/Fas gene polymorphisms and the age at disease onset. CONCLUSIONS: Overall, our findings suggest that Apo-1/Fas promoter gene polymorphisms are not conclusively related to susceptibility to MS or the clinical characteristics of Japanese patients with MS.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central