243 research outputs found

Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and Genome Analyzer systems

Author: Dohm Juliane C
Himmelbauer Heinz
Minoche André E
Publication venue: BioMed Central
Publication date: 01/01/2011

ABSTRACT: BACKGROUND: The generation and analysis of high-throughput sequencing data are becoming a major component of many studies in molecular biology and medical research. Illumina's Genome Analyzer (GA) and HiSeq instruments are currently the most widely used sequencing devices. Here, we comprehensively evaluate properties of genomic HiSeq and GAIIx data derived from two plant genomes and one virus, with read lengths of 95 to 150 bases. RESULTS: We provide quantifications and evidence for GC bias, error rates, error sequence context, effects of quality filtering, and the reliability of quality values. By combining different filtering criteria we reduced error rates 7-fold at the expense of discarding 12.5% of alignable bases. While overall error rates are low in HiSeq data we observed regions of accumulated wrong base calls. Only 3% of all error positions accounted for 24.7% of all substitution errors. Analyzing the forward and reverse strands separately revealed error rates of up to 18.7%. Insertions and deletions occurred at very low rates on average but increased to up to 2% in homopolymers. A positive correlation between read coverage and GC content was found depending on the GC content range. CONCLUSIONS: The errors and biases we report have implications for the use and the interpretation of Illumina sequencing data. GAIIx and HiSeq data sets show slightly different error profiles. Quality filtering is essential to minimize downstream analysis artifacts. Supporting previous recommendations, the strand-specificity provides a criterion to distinguish sequencing errors from low abundance polymorphisms

CiteSeerX

Crossref

Springer - Publisher Connector

PubMed Central

MPG.PuRe

Diagnostic applications of next generation sequencing: working towards quality standards

Author: Adey
Andrea Gehring
Anna Benet-Pagès
Bainbridge
Bentley
Carsten Bergmann
Clark
Clement
Ding
Dohm
Gerlinger
Gregory
Greif
Greif
Hanno Jörn Bolz
Hanns-Georg Klein
Harismendy
Ina Vogl
Jiang
Johansson
Kaimo Hirv
Kalari
Klaus H. Metzeler
Koboldt
Li
Lister
Loman
Mamanova
Manfred Stuhrmann
Mardis
Marius Kuhn
Mertes
Meyerson
Minoche
Nakamura
Philipp A. Greif
Robinson
Rothberg
Saskia Biskup
Sebastian H. Eck
Shendure
Stefan Kotschote
Stratton
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2012

Over the past 6 years, next generation sequencing (NGS) has been established as a valuable high-throughput method for research in molecular genetics and has successfully been employed in the identification of rare and common genetic variations. All major NGS technology companies providing commercially available instruments (Roche 454, Illumina, Life Technologies) have recently marketed bench top sequencing instruments with lower throughput and shorter run times, thereby broadening the applications of NGS and opening the technology to the potential use for clinical diagnostics. Although the high expectations regarding the discovery of new diagnostic targets and an overall reduction of cost have been achieved, technological challenges in instrument handling, robustness of the chemistry and data analysis need to be overcome. To facilitate the implementation of NGS as a routine method in molecular diagnostics, consistent quality standards need to be developed. Here the authors give an overview of the current standards in protocols and workflows and discuss possible approaches to define quality criteria for NGS in molecular genetic diagnostics

Crossref

Open Access LMU

PuSH

Dissect: detection and characterization of novel structural alterations in transcribed sequences

Author: Brassesco
Brudno
Burge
B secke
C. C. Collins
Caudevilla
D. Yorukoglu
De Braekeleer
F. Hach
Frantz
Gingeras
Hach
Horiuchi
I. Birol
Kidd
L. Swanson
Labrador
Levin
McPherson
Miller
Minoche
Mott
Nacu
S. C. Sahinalp
Sboner
Slater
Takahashi
Publication venue: Oxford University Press
Publication date: 01/01/2012

Motivation: Computational identification of genomic structural variants via high-throughput sequencing is an important problem for which a number of highly sophisticated solutions have been recently developed. With the advent of high-throughput transcriptome sequencing (RNA-Seq), the problem of identifying structural alterations in the transcriptome is now attracting significant attention

DSpace@MIT

Crossref

PubMed Central

Are sites with multiple single nucleotide variants in cancer genomes a consequence of drivers, hypermutable sites or sequencing errors?

Author: Alexandrov
Benson
Bird
Bulmer
Cooper
Derrien
Eyre-Walker
Flicek
Francioli
Fryxell
Gojobori
Harismendy
Harris
Harris
Hodgkinson
Hodgkinson
Hodgkinson
Huang
Hwang
Johnson
Karolchik
Kong
Lawrence
Liu
Lynch
Makova
Martinocorena
Michaelson
Minoche
Nachman
Nazarian
Nelder
Polak
Quail
Rosenfeld
Schrider
Schuster-Bockler
Smith
Treangen
Woo
Zhuang
Publication venue: 'PeerJ'
Publication date: 01/09/2016

Across independent cancer genomes it has been observed that some sites have been recurrently hit by single nucleotide variants (SNVs). Such recurrently hit sites might be either (i) drivers of cancer that are positively selected during oncogenesis, (ii) due to mutation rate variation, or (iii) due to sequencing and assembly errors. We have investigated the cause of recurrently hit sites in a dataset of >3 million SNVs from 507 complete cancer genome sequences. We find evidence that many sites have been hit significantly more often than one would expect by chance, even taking into account the effect of the adjacent nucleotides on the rate of mutation. We find that the density of these recurrently hit sites is higher in non-coding than coding DNA and hence conclude that most of them are unlikely to be drivers. We also find that most of them are found in parts of the genome that are not uniquely mappable and hence are likely to be due to mapping errors. In support of the error hypothesis, we find that recurrently hit sites are not randomly distributed across sequences from different laboratories. We fit a model to the data in which the rate of mutation is constant across sites but the rate of error varies. This model suggests that ∼4% of all SNVs are errors in this dataset, but that the rate of error varies by thousands-of-fold between sites

Crossref

Directory of Open Access Journals

PubMed Central

Sussex Research Online

Genome sequencing of the extinct Eurasian wild aurochs, Bos primigenius, illuminates the phylogeography and evolution of cattle

Author: A Achilli
A Esteve-Codina
A Gotherstrom
A Seguin-Orlando
A Vaysse
A Winter
AE Minoche
AJ Amaral
Alison Murphy
Amanda J. Lohan
Andrew T. Chamberlain
AV Zimin
B Grisart
BP Lewis
Brendan J. Loftus
C Gamba
C Glaser
Ceiridwen J. Edwards
CG Elsik
Charles Spillane
CJ Edwards
CJ Rubin
CJ Stevens
CM Leu
CS Troy
D Reich
Daniel G. Bradley
David A. Magee
David E. MacHugh
DE MacHugh
DG Bradley
DM Larkin
DP Toews
E Palkopoulou
E Svensson
EJ McTavish
EY Durand
H Jonsson
H Li
H Zhang
HD Daetwyler
J Clutton-Brock
J Diamond
J Kantanen
J Lenstra
J Schibler
JA Guerra-Assuncao
JD Vigne
JE Decker
JK Pickrell
JK Pritchard
JS Pedersen
K Prufer
KA Moutou
Kévin Rue-Albrecht
L Orlando
L Perez-Pardal
LK Matukumalli
M Gautier
M Hofreiter
M Li
M Meyer
M Raghavan
M Rasmussen
M Schubert
MA DePristo
MA Greagg
MA Groenen
Mark T. Donoghue
Martin Braud
Matthew D. Teasdale
MJ Montague
N Murakami
N Patterson
NA Rosenberg
O Smith
Paul A. McGettigan
R Bollongino
R Chen
RA Gibbs
RE Green
RH Meadow
RR Hudson
RT Loftus
S Bonfiglio
S Guindon
S Koks
S Paabo
S Qanbari
S Sawyer
Shuaishuai Tai
Stephen D E Park
Steven Schroeder
Tad S. Sonstegard
TH Lee
W McLaren
Y Benjamini
Yuan Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Background Domestication of the now-extinct wild aurochs, Bos primigenius, gave rise to the two major domestic extant cattle taxa, B. taurus and B. indicus. While previous genetic studies have shed some light on the evolutionary relationships between European aurochs and modern cattle, important questions remain unanswered, including the phylogenetic status of aurochs, whether gene flow from aurochs into early domestic populations occurred, and which genomic regions were subject to selection processes during and after domestication. Here, we address these questions using whole-genome sequencing data generated from an approximately 6,750-year-old British aurochs bone and genome sequence data from 81 additional cattle plus genome-wide single nucleotide polymorphism data from a diverse panel of 1,225 modern animals. Results Phylogenomic analyses place the aurochs as a distinct outgroup to the domestic B. taurus lineage, supporting the predominant Near Eastern origin of European cattle. Conversely, traditional British and Irish breeds share more genetic variants with this aurochs specimen than other European populations, supporting localized gene flow from aurochs into the ancestors of modern British and Irish cattle, perhaps through purposeful restocking by early herders in Britain. Finally, the functions of genes showing evidence for positive selection in B. taurus are enriched for neurobiology, growth, metabolism and immunobiology, suggesting that these biological processes have been important in the domestication of cattle. Conclusions This work provides important new information regarding the origins and functional evolution of modern cattle, revealing that the interface between early European domestic populations and wild aurochs was significantly more complex than previously thought

Crossref

Springer - Publisher Connector

PubMed Central

Spiral - Imperial College Digital Repository

The University of Manchester - Institutional Repository

Access to Research at National University of Ireland, Galway

University of Huddersfield Repository

Canfam GSD: De novo chromosome-length genome assembly of the German Shepherd Dog (Canis lupus familiaris) using a combination of long reads, optical mapping, and Hi-C

Author: Aiden Erez Lieberman
Ballard J. William O.
Barton Kirston
Bogdanovic Ozren
Chan Eva K. F.
Colaric Zane
Dudchenko Olga
Edwards Richard J.
Field Matt A.
Hayes Vanessa M.
Keilwagen Jens
Lyons Ruth J.
Minoche Andre E.
Omer Arina D.
Rosen Benjamin D.
Skvortsova Ksenia
Smith Martin A.
Smith Timothy P. L.
Tuipulotu Daniel Enosi
Zammit Robert A.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2020

Background: The German Shepherd Dog (GSD) is one of the most common breeds on earth and has been bred for its utility and intelligence. It is often first choice for police and military work, as well as protection, disability assistance, and search-and-rescue. Yet, GSDs are well known to be susceptible to a range of genetic diseases that can interfere with their training. Such diseases are of particular concern when they occur later in life, and fully trained animals are not able to continue their duties. Findings: Here, we provide the draft genome sequence of a healthy German Shepherd female as a reference for future disease and evolutionary studies. We generated this improved canid reference genome (CanFam GSD) utilizing a combination of Pacific Bioscience, Oxford Nanopore, 10X Genomics, Bionano, and Hi-C technologies. The GSD assembly is ∼80 times as contiguous as the current canid reference genome (20.9 vs 0.267 Mb contig N50), containing far fewer gaps (306 vs 23,876) and fewer scaffolds (429 vs 3,310) than the current canid reference genome CanFamv3.1. Two chromosomes (4 and 35) are assembled into single scaffolds with no gaps. BUSCO analyses of the genome assembly results show that 93.0% of the conserved single-copy genes are complete in the GSD assembly compared with 92.2% for CanFam v3.1. Homology-based gene annotation increases this value to ∼99%. Detailed examination of the evolutionarily important pancreatic amylase region reveals that there are most likely 7 copies of the gene, indicative of a duplication of 4 ancestral copies and the disruption of 1 copy. Conclusions: GSD genome assembly and annotation were produced with major improvement in completeness, continuity, and quality over the existing canid reference. This resource will enable further research related to canine diseases, the evolutionary relationships of canids, and other aspects of canid biology

ResearchOnline at James Cook University

The Australian dingo is an early offshoot of modern breed dogs

Author: Aiden Erez L.
Ballard J. William O.
Bogdanovic Ozren
Bustamante Sonia
Chan Eva K.F.
Chernoff Barry
Cochran Blake J.
Colaric Zane
Dudchenko Olga
Edwards Richard J.
Esvaran Meera
Field Matt A.
Gilbert M.Thomas P.
Keilwagen Jens
Manandhar Bikash
Melvin Richard G.
Minoche Andre E.
Omer Arina
Rasmussen Jacob Agerbo
Rosen Benjamin D.
Skvortsova Ksenia
Smith Timothy P.L.
Thomas Torsten
Yadav Sonu
Zammit Robert A.
Publication venue: 'American Association for the Advancement of Science (AAAS)'
Publication date: 01/01/2022

Dogs are uniquely associated with human dispersal and bring transformational insight into the domestication process. Dingoes represent an intriguing case within canine evolution being geographically isolated for thousands of years. Here, we present a high-quality de novo assembly of a pure dingo (CanFam_DDS). We identified large chromosomal differences relative to the current dog reference (CanFam3.1) and confirmed no expanded pancreatic amylase gene as found in breed dogs. Phylogenetic analyses using variant pairwise matrices show that the dingo is distinct from five breed dogs with 100% bootstrap support when using Greenland wolf as the outgroup. Functionally, we observe differences in methylation patterns between the dingo and German shepherd dog genomes and differences in serum biochemistry and microbiome makeup. Our results suggest that distinct demographic and environmental conditions have shaped the dingo genome. In contrast, artificial human selection has likely shaped the genomes of domestic breed dogs after divergence from the dingo

ResearchOnline at James Cook University

Copenhagen University Research Information System

PubMed Central

Whole genome sequencing for the genetic diagnosis of heterogenous dystonia phenotypes

Author: Chang Florence C. F.
Cowley Mark J.
Darveniza Paul
Davis Ryan L.
Drew Alex
Fung Victor S. C.
Gayevskiy Velimir
Gu Jason
Hayes Michael
Kang Ce
Kotschet Katya
Kumar Kishore R.
Kummerfeld Sarah
Levy Stanley (R20535)
Mahant Neil
Minoche Andre E.
Morales-Briceño Hugo
Ng Karl
Phua C.S.
Rowe Dominic B.
Siow Sue-Faye
Sue Carolyn M.
Tchan Michel C.
Tisch Stephen
Wali G. M.
Wali Gautam
Walls Zachary
Yiannikas Con
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019

Introduction: Dystonia is a clinically and genetically heterogeneous disorder and a genetic cause is often difficult to elucidate. This is the first study to use whole genome sequencing (WGS) to investigate dystonia in a large sample of affected individuals. Methods: WGS was performed on 111 probands with heterogenous dystonia phenotypes. We performed analysis for coding and non-coding variants, copy number variants (CNVs), and structural variants (SVs). We assessed for an association between dystonia and 10 known dystonia risk variants. Results: A genetic diagnosis was obtained for 11.7% (13/111) of individuals. We found that a genetic diagnosis was more likely in those with an earlier age at onset, younger age at testing, and a combined dystonia phenotype. We identified pathogenic/likely-pathogenic variants in ADCY5 (n = 1), ATM (n = 1), GNAL (n = 2), GLB1 (n = 1), KMT2B (n = 2), PRKN (n = 2), PRRT2 (n = 1), SGCE (n = 2), and THAP1 (n = 1). CNVs were detected in 3 individuals. We found an association between the known risk variant ARSG rs11655081 and dystonia (p = 0.003). Conclusion: A genetic diagnosis was found in 11.7% of individuals with dystonia. The diagnostic yield was higher in those with an earlier age of onset, younger age at testing, and a combined dystonia phenotype. WGS may be particularly relevant for dystonia given that it allows for the detection of CNVs, which accounted for 23% of the genetically diagnosed cases.

Western Sydney ResearchDirect

Sugar Beet BeetMap-3, and Steps to Improve the Genome Assembly and Genome Sequence Annotation (W875)

Author: Dohm Juliane
Himmelbauer Heinz
Holtgräwe Daniela
Kraft Thomas
Minoche Andre E.
Parol-Kryger Roza
Rosleff Sörensen Thomas
Schmidt Thomas
Schneider Jessica
Schulz Britta
Stadermann Kai Bernd
Stracke Ralf
Weisshaar Bernd
Zakrzewski Falk
Publication venue
Publication date: 01/01/2016

Weisshaar B, Himmelbauer H, Schmidt T, et al. Sugar Beet BeetMap-3, and Steps to Improve the Genome Assembly and Genome Sequence Annotation (W875). Presented at the Plant and Animal Genome XXIV Conference, San Diego, USA

Publications at Bielefeld University

Population genomics reveals that within-fungus polymorphism is common and maintained in populations of the mycorrhizal fungus Rhizophagus irregularis.

Author: 1000 Genomes Project Consortium
A Colard
AE Minoche
AM Koch
AM Reitzel
B Börstler
BB Larsen
BK Peterson
C Angelard
D Cantu
D Croll
D Laehnemann
D Sanglard
D Scaglione
D Wibberg
E Boon
E Paradis
E Tisserant
F Ronquist
Frédéric G Masclaux
G Kuhn
H Kim
H Li
I Ceballos
Ian R Sanders
IR Sanders
J Catchen
J Ropars
JI Hoffman
JK Hane
JS Paul
K Katoh
K Lin
KA Sedzielewska
KJ Emerson
L Munkvold
M Ehinger
M Hijri
M Öpik
Marco Pagni
MGA van der Heijden
MO Ehinger
N Corradi
N Wang
NA Baird
PA Hohenlohe
Pawel Rosikiewicz
SE Smith
T Jones
T Magoč
Tania Wyss
TL Parchman
V Lange
V Ter-Hovhannisyan
WR Pearson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Arbuscular mycorrhizal (AM) fungi are symbionts of most plants, increasing plant growth and diversity. The model AM fungus Rhizophagus irregularis (isolate DAOM 197198) exhibits low within-fungus polymorphism. In contrast, another study reported high within-fungus variability. Experiments with other R. irregularis isolates suggest that within-fungus genetic variation can affect the fungal phenotype and plant growth, highlighting the biological importance of such variation. We investigated whether there is evidence of differing levels of within-fungus polymorphism in an R. irregularis population. We genotyped 20 isolates using restriction site-associated DNA sequencing and developed novel approaches for characterizing polymorphism among haploid nuclei. All isolates exhibited higher within-isolate poly-allelic single-nucleotide polymorphism (SNP) densities than DAOM 197198 in repeated and non-repeated sites mapped to the reference genome. Poly-allelic SNPs were independently confirmed. Allele frequencies within isolates deviated from diploids or tetraploids, or that expected for a strict dikaryote. Phylogeny based on poly-allelic sites was robust and mirrored the standard phylogeny. This indicates that within-fungus genetic variation is maintained in AM fungal populations. Our results predict a heterokaryotic state in the population, considerable differences in copy number variation among isolates and divergence among the copies, or aneuploidy in some isolates. The variation may be a combination of all of these hypotheses. Within-isolate genetic variation in R. irregularis leads to large differences in plant growth. Therefore, characterizing genomic variation within AM fungal populations is of major ecological importance

Crossref

Serveur académique lausannois

PubMed Central