
eScholarship - University of California

Construction and characterization of a bacterial artificial chromosome (BAC) library for the A genome of wheat

Author: B. Keller
D. Lijavetzky
G. Muzzi
J. Dubcovsky
R. Wing
T. Wicker
Publication venue: Canadian Science Publishing
Publication date: 01/01/2002

Identification of potential transcriptionally active Copia LTR retrotransposons in Eucalyptus

Author: Celso Marino
CM Vicient
Douglas Domingues
EM McCarthy
F Zhou
Helena Marcon
J Du
J Jurka
K Tamura
M Bacci Júnior
R Vicentini
T Wicker
Y L’Homme
Z Xu
Publication venue: BioMed Central
Publication date: 01/01/2011

The Repetitive Landscape of the Barley Genome

Author: A d’Hont
A Suoniemi
AH Schulman
CD Hirsch
CM Vicient
CP Middleton
D Chalopin
EA Gladyshev
EV Leushkin
F Kempken
F Sabot
H Abe
HS Malik
I Manninen
IJ Leitch
International Brachypodium Initiative
JA Tanskanen
JP Buchmann
L Yang
LJ Kelly
M Jääskeläinen
M Mascher
P Neumann
R Kalendar
S Hudakova
S Roffler
T Bureau
T Wicker
VV Kapitonov
W Chang
Y Han
Publication venue: Springer
Publication date: 19/08/2018

While transposable elements (TEs) comprise the bulk of plant genomic DNA, how they contribute to genome structure and organization is still poorly understood. Especially in large genomes, where TEs make up the majority of genomic DNA, it is still unclear whether TEs target specific chromosomal regions or whether they simply accumulate where they are best tolerated. The barley genome, with its vast repetitive fraction, is an ideal system to study chromosomal organization and evolution of TEs. Genes make up only about 2% of the genome, while over 80% is derived from TEs. The TE fraction is composed of at least 350 different families. However, 50% of the genome consists of only 15 high-copy TE families, while all other TE families are present in moderate or low copy numbers. The barley genome is highly compartmentalized, with different types of TEs occupying different chromosomal “niches”, such as distal, interstitial or proximal regions of chromosome arms. Furthermore, gene space represents its own distinct genomic compartment that is enriched in small non-autonomous DNA transposons, suggesting that these TEs specifically target promoters and downstream regions. Some TE families also show a strong preference to insert in specific sequence motifs, which may, in part, explain their distribution. The family-specific distribution patterns result in distinct TE compositions of different chromosomal compartments.

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

Sequencing of BAC pools by different next generation sequencing platforms and strategies

Author: Andreas Petzold
Arabidopsis_Genome_Initiative
B Ewing
B Steuernagel
BJ Zonneveld
Burkhard Steuernagel
C Feuillet
D Schulte
Daniela Schulte
IRGSP
JC Dohm
K Eversole
Klaus FX Mayer
M Meyer
Marco Groth
Marius Felder
Matthias Platzer
Nils Stein
RK Varshney
Ruvini Ariyadasa
S Kurtz
S Rounsley
SA Goff
Stefan Taudien
T Wicker
Thomas Schmutzer
Uwe Scholz
VM Gonzalez
Y Palti
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Next generation sequencing of BACs is a viable option for deciphering the sequence of even large and highly repetitive genomes. In order to optimize this strategy, we examined the influence of read length on the quality of Roche/454 sequence assemblies, to what extent Illumina/Solexa mate pairs (MPs) improve the assemblies by scaffolding, and whether barcoding of BACs is dispensable. Results: Sequencing four BACs with both FLX and Titanium technologies revealed similar sequencing accuracy, but showed that the longer Titanium reads produce considerably fewer misassemblies and gaps. The 454 assemblies of 96 barcoded BACs were improved by scaffolding 79% of the total contig length with MPs from a non-barcoded library. Assembly of the unmasked 454 sequences without separation by barcodes revealed chimeric contig formation to be a major problem, encompassing 47% of the total contig length. Masking the sequences reduced this fraction to 24%. Conclusion: Optimal BAC pool sequencing should be based on the longest available reads, with barcoding essential for a comprehensive assessment of both repetitive and non-repetitive sequence information. When interest is restricted to non-repetitive regions and repeats are masked prior to assembly, barcoding is non-essential. In any case, the assemblies can be improved considerably by scaffolding with non-barcoded BAC pool MPs.

Directory of Open Access Journals

Hochschulbibliothekszentrum des Landes Nordrhein-Westfalen (hbz)

PuSH

Repeat-length variation in a wheat cellulose synthase-like gene is associated with altered tiller number and stem cell wall composition

Author: Berges H.
Breen J.
Hyles J.
MacMillan C.
Pettolino F.
Spielmeyer W.
Stachurski Z.
Vautrin S.
Wicker T.
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2017

The tiller inhibition gene (tin) that reduces tillering in wheat (Triticum aestivum) is also associated with large spikes, increased grain weight, and thick leaves and stems. In this study, comparison of near-isogenic lines (NILs) revealed changes in stem morphology, cell wall composition, and stem strength. Microscopic analysis of stem cross-sections and chemical analysis of stem tissue indicated that cell walls in tin lines were thicker and more lignified than in free-tillering NILs. Increased lignification was associated with stronger stems in tin plants. A candidate gene for tin was identified through map-based cloning and was predicted to encode a cellulose synthase-like (Csl) protein with homology to members of the CslA clade. Dinucleotide repeat-length polymorphism in the 5′UTR region of the Csl gene was associated with tiller number in diverse wheat germplasm and linked to expression differences of Csl transcripts between NILs. We propose that regulation of Csl transcript and/or protein levels affects carbon partitioning throughout the plant, which plays a key role in the tin phenotype.

Adelaide Research & Scholarship

HAL Descartes

ZORA

ProdInra

Low-pass shotgun sequencing of the barley genome facilitates rapid identification of genes, conserved non-coding sequences and novel repeats

Author: Graner A.
Narechania A.
Sabot F.
Stein J.
Stein N.
Vu Thi-Ha-Giang
Ware D.
Wicker T.
Publication date: 01/01/2008

BACKGROUND: Barley has one of the largest and most complex genomes of all economically important food crops. The rise of new short read sequencing technologies such as Illumina/Solexa permits such large genomes to be effectively sampled at relatively low cost. Based on the corresponding sequence reads, a Mathematically Defined Repeat (MDR) index can be generated to map repetitive regions in genomic sequences. RESULTS: We have generated 574 Mbp of Illumina/Solexa sequences from barley total genomic DNA, representing about 10% of a genome equivalent. From these sequences we generated an MDR index, which was then used to identify and mark repetitive regions in the barley genome. Comparison of the MDR plots with expert repeat annotation, which draws on the information already available for known repetitive elements, revealed a significant correspondence between the two methods. However, MDR-based annotation also identified dozens of novel repeat sequences that were not recognised by hand annotation. The MDR data were also used to identify gene-containing regions by masking repetitive sequences in eight de novo sequenced bacterial artificial chromosome (BAC) clones. Gene sequences could indeed be identified for half of the candidate gene islands. MDR data were of only limited use when mapped onto genomic sequences from the closely related species Triticum monococcum, as only a fraction of the repetitive sequences was recognised. CONCLUSION: An MDR index for barley, obtained by whole-genome Illumina/Solexa sequencing, proved as efficient in repeat identification as manual expert annotation. By circumventing the labour-intensive step of producing a specific repeat library for expert annotation, an MDR index provides an elegant and efficient resource for the identification of repetitive and low-copy (i.e. potentially gene-containing) regions in uncharacterised genomic sequences. The restriction that a particular MDR index cannot be used across species is outweighed by the low cost of Illumina/Solexa sequencing, which makes any chosen genome accessible for whole-genome sequence sampling.
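The abstract above does not say how the MDR index is computed. As a rough illustration only, the sketch below assumes that such an index amounts to counting how often each k-mer occurs in the sampled reads and then looking those counts up along a target sequence. The function names, the value of `k`, and the repeat threshold are all hypothetical choices, not parameters taken from the study.

```python
from collections import Counter


def build_kmer_index(reads, k):
    """Count every k-mer occurring in the sampled reads (assumed stand-in for an MDR index)."""
    index = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            index[read[i:i + k]] += 1
    return index


def kmer_profile(sequence, index, k):
    """For each k-mer start position in the sequence, look up its count in the index."""
    return [index[sequence[i:i + k]] for i in range(len(sequence) - k + 1)]


def repetitive_mask(profile, threshold):
    """Flag positions whose k-mer count reaches the (hypothetical) repeat threshold."""
    return [count >= threshold for count in profile]


# Toy data: real reads would number in the millions and k would be larger.
reads = ["ACGTACGT", "ACGTTT"]
index = build_kmer_index(reads, k=4)
profile = kmer_profile("ACGTTTGA", index, k=4)
mask = repetitive_mask(profile, threshold=2)
```

Positions flagged in `mask` would correspond to candidate repetitive regions, and unflagged stretches to low-copy, potentially gene-containing sequence.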

Aberystwyth Research Portal

Cold Spring Harbor Laboratory Institutional Repository

Directory of Open Access Journals

University of Regensburg Publication Server

ZORA

Substantial biases in ultra-short read data sets from high-throughput DNA sequencing

Author: C. Lottaz
Cheung
H. Himmelbauer
Huse
J. C. Dohm
Kim
Multer
Ng
Robertson
Sanger
Siddiqui
T. Borodina
Whiteford
Wicker
Publication venue: Oxford University Press
Publication date: 01/01/2008

Novel sequencing technologies permit the rapid production of large sequence data sets. These technologies are likely to revolutionize genetics and biomedical research, but a thorough characterization of the ultra-short read output is necessary. We generated and analyzed two Illumina 1G ultra-short read data sets, i.e. 2.8 million 27mer reads from a Beta vulgaris genomic clone and 12.3 million 36mers from the Helicobacter acinonychis genome. We found that error rates range from 0.3% at the beginning of reads to 3.8% at the end of reads. Wrong base calls are frequently preceded by base G. Base substitution error frequencies vary by 10- to 11-fold, with A > C transversions being among the most frequent and C > G transversions among the least frequent substitution errors. Insertions and deletions of single bases occur at very low rates. When simulating re-sequencing, we found a 20-fold sequencing coverage to be sufficient to compensate for errors with correct reads. The read coverage of the sequenced regions is biased; the highest read density was found in intervals with elevated GC content. High Solexa quality scores are over-optimistic and low scores underestimate the data quality. Our results show different types of biases and ways to detect them. Such biases have implications for the use and interpretation of Solexa data for de novo sequencing, re-sequencing, the identification of single nucleotide polymorphisms and DNA methylation sites, as well as for transcriptome analysis.

CiteSeerX

MPG.PuRe

Assessing pooled BAC and whole genome shotgun strategies for assembly of complex genomes

Author: AH Paterson
B Steuernagel
E Pennisi
F Alex Feltus
GA Tuskan
IRGSP
J Shendure
JR Miller
Laxmi Parida
MC Schatz
Niina Haiminen
NL Quinn
O Jaillon
PS Schnable
S Gnerre
S Rounsley
SF Altschul
SM Goldberg
T Wicker
Y Ding
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: We investigate whether pooling BAC clones and sequencing the pools can provide more accurate assembly of genome sequences than the "whole genome shotgun" (WGS) approach. Furthermore, we quantify this accuracy increase. We compare the pooled BAC and WGS approaches using in silico simulations. Standard measures of assembly quality focus on assembly size and fragmentation, which are desirable for large whole genome assemblies. We propose additional measures enabling easy and visual comparison of assembly quality, such as rearrangements and redundant sequence content, relative to the known target sequence. Results: The best assembly quality scores were obtained using 454 coverage of 15× linear and 5× paired (3 kb insert size) reads (15L-5P) on Arabidopsis. This regime gave similarly good results on four additional plant genomes of very different GC and repeat contents. BAC pooling improved assembly scores over WGS assembly, with coverage and redundancy scores improving the most. Conclusions: BAC pooling works better than WGS; however, both require a physical map to order the scaffolds. Pool sizes up to 12 Mbp work well, suggesting this pooling density to be effective in medium-scale re-sequencing applications such as targeted sequencing of QTL intervals for candidate gene discovery. Assuming the current Roche/454 Titanium sequencing limitations, a 12 Mbp region could be re-sequenced with a full plate of linear reads and a half plate of paired-end reads, yielding 15L-5P coverage after read pre-processing. Our simulation suggests that massive over-sequencing may not improve accuracy. Our scoring measures can be used generally to evaluate and compare results of simulated genome assemblies.
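The 15L-5P regime quoted above can be turned into a rough sequencing budget. The sketch below is plain coverage arithmetic (bases needed = region size × coverage) applied to the 12 Mbp pool size from the abstract. The mean read length of 400 bp is an assumed placeholder, not a figure from the study, and losses during read pre-processing are ignored.

```python
import math


def bases_required(region_bp, coverage):
    """Total sequenced bases needed to cover a region at the given fold coverage."""
    return region_bp * coverage


def reads_required(region_bp, coverage, mean_read_length):
    """Number of reads needed, rounded up, for an assumed mean read length."""
    return math.ceil(bases_required(region_bp, coverage) / mean_read_length)


REGION_BP = 12_000_000       # 12 Mbp pool size, from the abstract
ASSUMED_READ_LENGTH = 400    # hypothetical mean read length in bp

linear_bases = bases_required(REGION_BP, 15)   # 15x linear reads
paired_bases = bases_required(REGION_BP, 5)    # 5x paired reads
linear_reads = reads_required(REGION_BP, 15, ASSUMED_READ_LENGTH)
paired_reads = reads_required(REGION_BP, 5, ASSUMED_READ_LENGTH)
```

Under these assumptions the linear component needs 180 Mbp of sequence and the paired component 60 Mbp. Actual plate yields and usable read lengths would shift the read counts.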

Directory of Open Access Journals

Genome-wide comparison of Asian and African rice reveals high recent activity of DNA transposons

Author: AH Paterson
AJ Hartlerode
B Edlinger
B Piegu
C Feschotte
F Sabot
G Yang
GTH Vu
International Human Genome Sequencing Consortium
International Rice Genome Sequencing Project
JM Richardson
JP Buchmann
K Fujino
K Kikuchi
L Duret
LS Symington
M Kimura
M Wang
N Jiang
P Cao
P SanMiguel
R Kalendar
RH Plasterk
S Moon
S Ouyang
T Nakazaki
T Wicker
TD Wu
TE Bureau
The International Brachypodium Initiative
V Robert
WR Engels
Publication venue: Springer Science and Business Media LLC