75 research outputs found

Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects

Author: Hall Ira M
Larson David E
Regier Allison A
et al.
Publication venue: Digital Commons@Becker
Publication date: 01/01/2018

A strategy for building and using a human reference pangenome

Author: Llamas Bastien
Regier Allison
et al.
Publication venue: Digital Commons@Becker
Publication date: 01/01/2019

In March 2019, 45 scientists and software engineers from around the world converged at the University of California, Santa Cruz for the first pangenomics codeathon. The purpose of the meeting was to propose technical specifications and standards for a usable human pangenome as well as to build relevant tools for genome graph infrastructures. During the meeting, the group held several intense and productive discussions covering a diverse set of topics, including advantages of graph genomes over a linear reference representation, design of new methods that can leverage graph-based data structures, and novel visualization and annotation approaches for pangenomes. Additionally, the participants organized themselves into teams that worked intensely over a three-day period to build a set of pipelines and tools for specific pangenomic applications. A summary of the questions raised and the tools developed is reported in this manuscript.

Digital Commons@Becker

High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios

Author: Abel Haley J
Byrska-Bishop Marta
Hall Ira M
Regier Allison A
et al.
Publication venue: Elsevier BV
Publication date: 01/09/2022

The 1000 Genomes Project (1kGP) is the largest fully open resource of whole-genome sequencing (WGS) data consented for public distribution without access or use restrictions. The final, phase 3 release of the 1kGP included 2,504 unrelated samples from 26 populations and was based primarily on low-coverage WGS. Here, we present a high-coverage 3,202-sample WGS 1kGP resource, which now includes 602 complete trios, sequenced to a depth of 30X using Illumina. We performed single-nucleotide variant (SNV) and short insertion and deletion (INDEL) discovery and generated a comprehensive set of structural variants (SVs) by integrating multiple analytic methods through a machine learning model. We show gains in sensitivity and precision of variant calls compared to phase 3, especially among rare SNVs as well as INDELs and SVs spanning the frequency spectrum. We also generated an improved reference imputation panel, making variants discovered here accessible for association studies.

Digital Commons@Becker

A draft human pangenome reference

Author: Abel Haley J
Antonacci-Fulton Lucinda L
Cody Sarah
Fulton Robert S
Liao Wen-Wei
Regier Allison A
Tomlinson Chad
Wang Ting
et al.
Publication venue: Digital Commons@Becker
Publication date: 01/05/2023

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals.

Digital Commons@Becker

Mapping and characterization of structural variation in 17,795 human genomes

Author: Abel Haley J.
Daly Mark J.
Hall Ira M.
Larson David E.
NHGRI Ctr Common
Palotie Aarno
Regier Allison A.
Ripatti Samuli
Salomaa Veikko
Taskinen Marja-Riitta
Publication date: 02/07/2020

Structural variants in more than 17,000 human genomes are mapped and characterized using whole-genome sequencing, showing how this type of variation contributes to rare deleterious coding and noncoding alleles. A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline (1) to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0-11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing.

Helsingin yliopiston digitaalinen arkisto

Breakpoint structure of the Anopheles gambiae 2Rb chromosomal inversion

Author: Besansky Nora J
Bretz David A
Collins Frank H
Costantini Carlo
Emrich Scott J
Lobo Neil F
Regier Allison A
Reidenbach Kyanne R
Sangaré Djibril M
Sharakhova Maria V
Traore Sekou F
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Alternative arrangements of chromosome 2 inversions in Anopheles gambiae are important sources of population structure, and are associated with adaptation to environmental heterogeneity. The forces responsible for their origin and maintenance are incompletely understood. Molecular characterization of inversion breakpoints provides insight into how they arose, and provides the basis for development of molecular karyotyping methods useful in future studies. Methods: Sequence comparison of regions near the cytological breakpoints of 2Rb allowed the molecular delineation of breakpoint boundaries. Comparisons were made between the standard 2R+b arrangement in the An. gambiae PEST reference genome and the inverted 2Rb arrangements in the An. gambiae M and S genome assemblies. Sequence differences between alternative 2Rb arrangements were exploited in the design of a PCR diagnostic assay, which was evaluated against the known chromosomal banding pattern of laboratory colonies and field-collected samples from Mali and Cameroon. Results: The breakpoints of the 7.55 Mb 2Rb inversion are flanked by extensive runs of the same short (72 bp) tandemly organized sequence, which was likely responsible for chromosomal breakage and rearrangement. Application of the molecular diagnostic assay suggested that 2Rb has a single common origin in An. gambiae and its sibling species, Anopheles arabiensis, and also that the standard arrangement (2R+b) may have arisen twice through breakpoint reuse. The molecular diagnostic was reliable when applied to laboratory colonies, but its accuracy was lower in natural populations. Conclusions: The complex repetitive sequence flanking the 2Rb breakpoint region may be prone to structural and sequence-level instability. The 2Rb molecular diagnostic has immediate application in studies based on laboratory colonies, but its usefulness in natural populations awaits development of complementary molecular tools.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Horizon / Pleins textes

Mitochondrial genome copy number measured by DNA sequencing in human blood is strongly associated with metabolic traits via cell-type composition differences

Author: Abel Haley
Chen Lei
Christ Ryan
Das Indraniel
Ganel Liron
Hall Ira M
Kanchi Krishna
Kang Chul Joo
Larson David
Locke Adam
Regier Allison
Scott Alexandra
Stitziel Nathan O
Young Erica
et al.
Publication venue: Digital Commons@Becker
Publication date: 07/06/2021

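BACKGROUND: Mitochondrial genome copy number (MT-CN) varies among humans and across tissues and is highly heritable, but its causes and consequences are not well understood. When measured by bulk DNA sequencing in blood, MT-CN may reflect a combination of the number of mitochondria per cell and cell-type composition. Here, we studied MT-CN variation in blood-derived DNA from 19184 Finnish individuals using a combination of genome (N = 4163) and exome sequencing (N = 19034) data as well as imputed genotypes (N = 17718). RESULTS: We identified two loci significantly associated with MT-CN variation: a common variant at the MYB-HBS1L locus (P = 1.6 × 10⁻⁸), which has previously been associated with numerous hematological parameters; and a burden of rare variants in the TMBIM1 gene (P = 3.0 × 10⁻⁸), which has been reported to protect against non-alcoholic fatty liver disease. We also found that MT-CN is strongly associated with insulin levels (P = 2.0 × 10⁻²¹) and other metabolic syndrome (metS)-related traits. Using a Mendelian randomization framework, we show evidence that MT-CN measured in blood is causally related to insulin levels. We then applied an MT-CN polygenic risk score (PRS) derived from Finnish data to the UK Biobank, where the association between the PRS and metS traits was replicated. Adjusting for cell counts largely eliminated these signals, suggesting that MT-CN affects metS via cell-type composition. CONCLUSION: These results suggest that measurements of MT-CN in blood-derived DNA partially reflect differences in cell-type composition and that these differences are causally linked to insulin and related traits.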

Digital Commons@Becker

Mitochondrial genome copy number measured by DNA sequencing in human blood is strongly associated with metabolic traits via cell-type composition differences

Author: Abel Haley
Boehnke Michael
Chen Lei
Chiang Charleston W. K.
Christ Ryan
Das Indraniel
Freimer Nelson
Ganel Liron
Hall Ira M.
Havulinna Aki
Kanchi Krishna
Kang Chul Joo
Kuusisto Johanna
Laakso Markku
Larson David
Locke Adam
Palotie Aarno
Regier Allison
Ripatti Samuli
Scott Alexandra
Service Susan
Stitziel Nathan O.
Vangipurapu Jagadish
Young Erica
Publication date: 01/06/2021

Background: Mitochondrial genome copy number (MT-CN) varies among humans and across tissues and is highly heritable, but its causes and consequences are not well understood. When measured by bulk DNA sequencing in blood, MT-CN may reflect a combination of the number of mitochondria per cell and cell-type composition. Here, we studied MT-CN variation in blood-derived DNA from 19184 Finnish individuals using a combination of genome (N = 4163) and exome sequencing (N = 19034) data as well as imputed genotypes (N = 17718). Results: We identified two loci significantly associated with MT-CN variation: a common variant at the MYB-HBS1L locus (P = 1.6 × 10⁻⁸), which has previously been associated with numerous hematological parameters; and a burden of rare variants in the TMBIM1 gene (P = 3.0 × 10⁻⁸), which has been reported to protect against non-alcoholic fatty liver disease. We also found that MT-CN is strongly associated with insulin levels (P = 2.0 × 10⁻²¹) and other metabolic syndrome (metS)-related traits. Using a Mendelian randomization framework, we show evidence that MT-CN measured in blood is causally related to insulin levels. We then applied an MT-CN polygenic risk score (PRS) derived from Finnish data to the UK Biobank, where the association between the PRS and metS traits was replicated. Adjusting for cell counts largely eliminated these signals, suggesting that MT-CN affects metS via cell-type composition. Conclusion: These results suggest that measurements of MT-CN in blood-derived DNA partially reflect differences in cell-type composition and that these differences are causally linked to insulin and related traits.

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Helsingin yliopiston digitaalinen arkisto

Deep Blue Documents at the University of Michigan

High-throughput 454 resequencing for allele discovery and recombination mapping in Plasmodium falciparum

Author: Allison Regier
Asako Tan
Brendan Collins
Brian A Desany
John C Tan
Michael T Ferdig
Scott J Emrich
Upeka Samarakoon
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Knowledge of the origins, distribution, and inheritance of variation in the malaria parasite (Plasmodium falciparum) genome is crucial for understanding its evolution; however, the 81% (A+T) genome poses challenges to high-throughput sequencing technologies. We explore the viability of the Roche 454 Genome Sequencer FLX (GS FLX) high-throughput sequencing technology for both whole genome sequencing and fine-resolution characterization of genetic exchange in malaria parasites. Results: We present a scheme to survey recombination in the haploid stage genomes of two sibling parasite clones, using whole genome pyrosequencing that includes a sliding window approach to predict recombination breakpoints. Whole genome shotgun (WGS) sequencing generated approximately 2 million reads, with an average read length of approximately 300 bp. De novo assembly using a combination of WGS and 3 kb paired end libraries resulted in contigs ≤ 34 kb. More than 8,000 of the 24,599 SNP markers identified between parents were genotyped in the progeny, resulting in a marker density of approximately 1 marker/3.3 kb and allowing for the detection of previously unrecognized crossovers (COs) and many non-crossover (NCO) gene conversions throughout the genome. Conclusions: By sequencing the 23 Mb genomes of two haploid progeny clones derived from a genetic cross at more than 30× coverage, we captured high resolution information on COs, NCOs and genetic variation within the progeny genomes. This study is the first to resequence progeny clones to examine fine structure of COs and NCOs in malaria parasites.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Subgrouping the autism "spectrum": reflections on DSM-5

Author: Bhismadev Chakrabarti
Eric Nestler
Meng-Chuan Lai
Michael V. Lombardo
Simon Baron-Cohen
Publication venue: Public Library of Science (PLoS)
Publication date: 23/04/2013

DSM-5 has moved autism from the level of subgroups ("apples and oranges") to the prototypical level ("fruit"). But making progress in research, and ultimately improving clinical practice, will require identifying subgroups within the autism spectrum.

Central Archive at the University of Reading

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

National Taiwan University Repository