
1,073 research outputs found

Assessing pooled BAC and whole genome shotgun strategies for assembly of complex genomes

Author: AH Paterson
B Steuernagel
E Pennisi
F Alex Feltus
GA Tuskan
IRGSP
J Shendure
JR Miller
Laxmi Parida
MC Schatz
Niina Haiminen
NL Quinn
O Jaillon
PS Schnable
S Gnerre
S Rounsley
SF Altschul
SF Altschul
SM Goldberg
T Wicker
T Wicker
Y Ding
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract. Background: We investigate whether pooling BAC clones and sequencing the pools can provide more accurate assembly of genome sequences than the "whole genome shotgun" (WGS) approach, and we quantify this accuracy increase. We compare the pooled BAC and WGS approaches using in silico simulations. Standard measures of assembly quality focus on assembly size and fragmentation, which are desirable for large whole genome assemblies. We propose additional measures enabling easy and visual comparison of assembly quality, such as rearrangements and redundant sequence content, relative to the known target sequence. Results: The best assembly quality scores were obtained using 454 coverage of 15× linear and 5× paired (3 kb insert size) reads (15L-5P) on Arabidopsis. This regime gave similarly good results on four additional plant genomes of very different GC and repeat contents. BAC pooling improved assembly scores over WGS assembly, with coverage and redundancy scores improving the most. Conclusions: BAC pooling works better than WGS; however, both require a physical map to order the scaffolds. Pool sizes up to 12 Mbp work well, suggesting this pooling density to be effective in medium-scale re-sequencing applications such as targeted sequencing of QTL intervals for candidate gene discovery. Assuming the current Roche/454 Titanium sequencing limitations, a 12 Mbp region could be re-sequenced with a full plate of linear reads and a half plate of paired-end reads, yielding 15L-5P coverage after read pre-processing. Our simulation suggests that massively over-sequencing may not improve accuracy. Our scoring measures can be used generally to evaluate and compare results of simulated genome assemblies.
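
As a rough check on the sequencing budget mentioned in the abstract above, the number of bases needed for a given coverage is the region length multiplied by the coverage. The sketch below is illustrative only: the function name and constants are assumptions introduced here, not taken from the paper, and it does not model read lengths or reads lost in pre-processing.

```python
def bases_required(region_bp: int, coverage: float) -> int:
    """Total sequenced bases needed to reach `coverage` over `region_bp`."""
    return round(region_bp * coverage)


# Values quoted in the abstract; the variable names are hypothetical.
REGION_BP = 12_000_000   # 12 Mbp target region
LINEAR_COVERAGE = 15     # the "15L" part of 15L-5P
PAIRED_COVERAGE = 5      # the "5P" part of 15L-5P

linear_bases = bases_required(REGION_BP, LINEAR_COVERAGE)
paired_bases = bases_required(REGION_BP, PAIRED_COVERAGE)

print(f"linear reads: {linear_bases / 1e6:.0f} Mbp")
print(f"paired reads: {paired_bases / 1e6:.0f} Mbp")
```

Under these assumptions, 15L-5P coverage of a 12 Mbp region corresponds to about 180 Mbp of usable linear-read sequence and 60 Mbp of usable paired-read sequence.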

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Unusual patterns of genetic diversity and gene expression in the maize genome

Author: Li Li
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2009

Maize (Zea mays subsp. mays) was domesticated from teosinte (Z. mays subsp. parviglumis) in southern Mexico between 6,000 and 9,000 years ago (Matsuoka et al., 2002; Sluyter and Dominguez, 2006). Both domestication and crop improvement involved selection of specific alleles at genes, resulting in reduced genetic diversity in the genes controlling key morphological and agronomic traits. This is termed the genetic bottleneck. We coupled the approaches of molecular population genetics with reverse genetics to associate genes with phenotypes. More than 16,000 primer pairs were subjected to gel and temperature gradient capillary electrophoresis (TGCE)-based assays. This screen identified 73 genes that contain zero sequence diversity (ZSD) fragments. They are monomorphic among 59 diverse maize lines, but polymorphic among 9 teosinte lines. Therefore, they are candidate domestication-related genes. Using 3,000 Mutator-insertion lines, a large-scale screen for Mu transposon insertions in domestication candidate genes was performed. Our data support the bottleneck model and, based on our test, at least 0.5% of maize genes were under selection during maize domestication. Phenotypic analysis of plants homozygous for the Mu-insertion alleles of the domestication candidate genes is underway. We also detected two other interesting features in the maize genome: nearly identical paralogs (NIPs) and orphan genes. First, our data suggest that at least ~1% of maize genes are members of a NIP family, defined as paralogous genes that exhibit ≥98% identity (Emrich et al. 2007). Members of a NIP family are expressed and, in some instances, members of a given NIP family exhibit differential patterns of gene expression. NIPs may have played important roles during the evolution and domestication of maize. NIP expression data support the subfunctionalization model for duplicated genes. NIPs were also detected in other maize inbred lines.
Second, ~400 orphan transcripts were captured via 454 sequencing of cDNA isolated using laser capture microdissection (LCM) from functionally important shoot apical meristems (SAMs). The expression of 27 randomly picked cDNAs was validated via RT-PCR. Expression of 20 of these SAM-expressed genes (~74%) was not detected in meristem-rich immature ears. 454-sequenced cDNAs, isolated by LCM from B73 and Mo17 SAMs, enabled us to detect gene-associated SNPs that escaped previous tests.

Digital Repository @ Iowa State University (ISU)

The ALDH gene superfamily of Arabidopsis

Author: Bartels D.
Kirch H.
Schnable P.
Wei Y.
Wood A.
Publication date: 01/08/2004

MPG.PuRe

Physical and Genetic Structure of the Maize Genome Reflects Its Complex Evolutionary History

Author: Andrew H Paterson
Arvind K Bharti
Carol Soderlund
Ed Butler
Ed Coe
Fred Engler
Fusheng Wei
Galina Fuks
Georgia Davis
Hector Sanchez-Villeda
HyeRan Kim
International Rice Genome Sequencing Project
Jack Gardiner
Joachim Messing
John E Bowers
Jose Luis Goicoechea
Joseph R Ecker
Karen Cone
Mary Schaeffer
Michael McMullen
Mingsheng Chen
Rice Chromosomes 11 and 12 Sequencing Consortia
Rod A Wing
Seunghee Lee
Steven Schroeder
William Nelson
Zhiwei Fang
Publication venue: Public Library of Science
Publication date: 01/07/2007

Maize (Zea mays L.) is one of the most important cereal crops and a model for the study of genetics, evolution, and domestication. To better understand maize genome organization and to build a framework for genome sequencing, we constructed a sequence-ready fingerprinted contig-based physical map that covers 93.5% of the genome, of which 86.1% is aligned to the genetic map. The fingerprinted contig map contains 25,908 genic markers that enabled us to align nearly 73% of the anchored maize genome to the rice genome. The distribution pattern of expressed sequence tags correlates to that of recombination. In collinear regions, 1 kb in rice corresponds to an average of 3.2 kb in maize, yet maize has a 6-fold genome size expansion. This can be explained by the fact that most rice regions correspond to two regions in maize as a result of its recent polyploid origin. Inversions account for the majority of chromosome structural variations during subsequent maize diploidization. We also find clear evidence of ancient genome duplication predating the divergence of the progenitors of maize and rice. Reconstructing the paleoethnobotany of the maize genome indicates that the progenitors of modern maize contained ten chromosomes.
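
The expansion argument in the abstract above can be checked with one multiplication: a local 3.2-fold expansion, applied to two maize regions per rice region, gives roughly the reported 6-fold genome size difference. The sketch below is a simplified illustration; the names are hypothetical, and treating every rice region as having exactly two maize counterparts is an assumption (the abstract says "most").

```python
def implied_genome_expansion(local_ratio: float, regions_per_rice_region: int) -> float:
    """Genome-wide expansion implied by a local size ratio and a duplication count."""
    return local_ratio * regions_per_rice_region


# Values quoted in the abstract; the variable names are hypothetical.
KB_MAIZE_PER_KB_RICE = 3.2          # collinear regions: 1 kb rice ~ 3.2 kb maize
MAIZE_REGIONS_PER_RICE_REGION = 2   # simplification of "most rice regions"

expansion = implied_genome_expansion(KB_MAIZE_PER_KB_RICE, MAIZE_REGIONS_PER_RICE_REGION)
print(f"implied expansion: about {expansion:.1f}-fold")
```

The result, about 6.4-fold, is in line with the 6-fold expansion stated in the abstract.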

Crossref

Directory of Open Access Journals

PubMed Central

University of Queensland eSpace

Assembling genomes using short-read sequencing technology

Author: Birol İnanç
Jackman Shaun D
Publication venue: BioMed Central
Publication date: 01/01/2010

Short-read sequencing technology can deliver gigabase genome assemblies for under a million dollars.

Crossref

PubMed Central

Complete Chloroplast Genome Sequence of a Major Allogamous Forage Species, Perennial Ryegrass (Lolium perenne L.)

Author: Calsa Júnior
Chen
Chen
Chung
Corneille
Cummings
Daniell
Daniell
Diekmann
Drescher
Freyer
Hoch
Hodkinson
Igloi
Inada
Jansen
K. Diekmann
K. H. Wolfe
Keeling
Kelchner
Kim
Kim
Kumar
Lilly
Maier
McGrath
Ogihara
P. J. Dix
R. van den Bekerom
Ruf
S. Barth
Sato
Shimada
Sugita
Sugiura
Söll
T. R. Hodkinson
Timme
Tsudzuki
Tsumura
Vogel
Weglöhner
Yukawa
Zeltz
Publication venue: Oxford University Press
Publication date: 01/01/2009

Lolium perenne L. (perennial ryegrass) is globally one of the most important forage and grassland crops. We sequenced the chloroplast (cp) genome of Lolium perenne cultivar Cashel. The L. perenne cp genome is 135 282 bp with a typical quadripartite structure. It contains genes for 76 unique proteins, 30 tRNAs and four rRNAs. As in other grasses, the genes accD, ycf1 and ycf2 are absent. The genome is of average size within its subfamily Pooideae and of medium size within the Poaceae. Genome size differences are mainly due to length variations in non-coding regions. However, considerable length differences of 1–27 codons in comparison of L. perenne to other Poaceae and 1–68 codons among all Poaceae were also detected. Within the cp genome of this outcrossing cultivar, 10 insertion/deletion polymorphisms and 40 single nucleotide polymorphisms were detected. Two of the polymorphisms involve tiny inversions within hairpin structures. By comparing the genome sequence with RT–PCR products of transcripts for 33 genes, 31 mRNA editing sites were identified, five of them unique to Lolium. The cp genome sequence of L. perenne is available under Accession number AM777385 at the European Molecular Biology Laboratory, National Center for Biotechnology Information and DNA DataBank of Japan.

Crossref

MURAL - Maynooth University Research Archive Library

PubMed Central

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

SNP discovery via 454 transcriptome sequencing

Author: Batley
Bennetzen
Bray
Ching
Cowles
Dantec
Emrich
Emrich
Emrich
Feltus
Fu
Fu
Fu
Guo
Gut
Jander
Kota
Kota
Kwok
Lopez
Manaster
Margulies
Marth
Meyers
Nickerson
Pavy
Rafalski
Sachidanandam
SanMiguel
Schnable
Stupar
Tenaillon
Tenaillon
Useche
Usuka
Wang
Wang
Weckx
Whitelaw
Whitt
Wiltshire
Wright
Yamasaki
Zhang
Publication venue: Blackwell Publishing Ltd

A massively parallel pyro-sequencing technology commercialized by 454 Life Sciences Corporation was used to sequence the transcriptomes of shoot apical meristems isolated from two inbred lines of maize using laser capture microdissection (LCM). A computational pipeline that uses the POLYBAYES polymorphism detection system was adapted for 454 ESTs and used to detect SNPs (single nucleotide polymorphisms) between the two inbred lines. Putative SNPs were computationally identified using 260 000 and 280 000 454 ESTs from the B73 and Mo17 inbred lines, respectively. Over 36 000 putative SNPs were detected within 9980 unique B73 genomic anchor sequences (MAGIs). Stringent post-processing reduced this number to > 7000 putative SNPs. Over 85% (94/110) of a sample of these putative SNPs were successfully validated by Sanger sequencing. Based on this validation rate, this pilot experiment conservatively identified > 4900 valid SNPs within > 2400 maize genes. These results demonstrate that 454-based transcriptome sequencing is an excellent method for the high-throughput acquisition of gene-associated SNPs.

Crossref

PubMed Central

Mu Transposon Insertion Sites and Meiotic Recombination Events Co-Localize with Epigenetic Marks for Open Chromatin across the Maize Genome

Author: A Kong
A Miyao
AC Spradling
AD Cresse
AM Settles
BP May
C Feschotte
C Soderlund
C Yan
Cheng-Ting Yeh
CJ Wang
CR Dietrich
CY Kramer
D Lisch
D Lisch
D Mester
Dan Nettleton
DI Mester
DR McCarty
DS Robertson
E Coe
ED Akhunov
F Qiu
GC Liao
GK Wong
GT Marth
H Candela
H Santos-Rosa
Haiyan Wu
Harmit S. Malik
HK Dooner
Ho Man Tang
J Fernandes
J Li
JL Bennetzen
JL Gerton
JM Kolkman
K Fengler
K Ohtsu
Kai Ying
KC Cone
KJ Hardeman
L Das
LE Palmer
M Alleman
M Falque
M Yamazaki
MN Raizada
MP Ball
N Jiang
P SanMiguel
Patrick S. Schnable
PS Schnable
R Lister
S Hake
S Hanley
S Liu
S Liu
Sanzhen Liu
SJ Cokus
SJ Emrich
SM Fullerton
SN Wood
SN Wood
TD Wu
Tieming Ji
TK Wolfgruber
TP Brutnell
V Borde
V Walbot
WS Cleveland
X Li
X Wang
X Zhang
Y Fu
Y Fu
Yan Fu
Publication venue: Public Library of Science
Publication date: 01/11/2009

The Mu transposon system of maize is highly active, with each of the ∼50–100 copies transposing on average once each generation. The approximately one dozen distinct Mu transposons contain highly similar ∼215 bp terminal inverted repeats (TIRs) and generate 9-bp target site duplications (TSDs) upon insertion. Using a novel genome walking strategy that uses these conserved TIRs as primer binding sites, Mu insertion sites were amplified from Mu stocks and sequenced via 454 technology. 94% of ∼965,000 reads carried Mu TIRs, demonstrating the specificity of this strategy. Among these TIRs, 21 novel Mu TIRs were discovered, revealing additional complexity of the Mu transposon system. The distribution of >40,000 non-redundant Mu insertion sites was strikingly non-uniform, such that rates increased in proportion to distance from the centromere. An identified putative Mu transposase binding consensus site does not explain this non-uniformity. An integrated genetic map containing more than 10,000 genetic markers was constructed and aligned to the sequence of the maize reference genome. Recombination rates (cM/Mb) are also strikingly non-uniform, with rates increasing in proportion to distance from the centromere. Mu insertion site frequencies are strongly correlated with recombination rates. Gene density does not fully explain the chromosomal distribution of Mu insertion and recombination sites, because pronounced preferences for the distal portion of chromosome are still observed even after accounting for gene density. The similarity of the distributions of Mu insertions and meiotic recombination sites suggests that common features, such as chromatin structure, are involved in site selection for both Mu insertion and meiotic recombination. 
The finding that Mu insertions and meiotic recombination sites both concentrate in genomic regions marked with epigenetic marks of open chromatin provides support for the hypothesis that open chromatin enhances rates of both Mu insertion and meiotic recombination.

Digital Repository @ Iowa State University (ISU)

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central