Sequencing of 15 622 Gene-bearing BACs Clarifies the Gene-dense Regions of the Barley Genome
Barley (Hordeum vulgare L.) possesses a large and highly repetitive genome of 5.1 Gb that has hindered the development of a complete sequence. In 2012, the International Barley Sequencing Consortium released a resource integrating whole-genome shotgun sequences with a physical and genetic framework. However, because only 6278 bacterial artificial chromosomes (BACs) in the physical map were sequenced, fine structure was limited. To gain access to the gene-containing portion of the barley genome at high resolution, we identified and sequenced 15 622 BACs representing the minimal tiling path of 72 052 physical-mapped gene-bearing BACs. This generated ~1.7 Gb of genomic sequence containing an estimated 2/3 of all Morex barley genes. Exploration of these sequenced BACs revealed that although distal ends of chromosomes contain most of the gene-enriched BACs and are characterized by high recombination rates, there are also gene-dense regions with suppressed recombination. We made use of published map-anchored sequence data from Aegilops tauschii to develop a synteny viewer between barley and the ancestor of the wheat D-genome. Except for some notable inversions, there is a high level of collinearity between the two species. The software HarvEST:Barley provides facile access to BAC sequences and their annotations, along with the barley–Ae. tauschii synteny viewer. These BAC sequences constitute a resource to improve the efficiency of marker development, map-based cloning, and comparative genomics in barley and related crops. Additional knowledge about regions of the barley genome that are gene-dense but low recombination is particularly relevant.
Keywords: Aegilops tauschii, Barley, centromere BACs, HarvEST:Barley, gene distribution, synteny, recombination frequency, Hordeum vulgare L., BAC sequencing
This is the publisher’s final PDF. The published article is copyrighted by the author(s) and published by John Wiley & Sons Ltd. on behalf of the Society for Experimental Biology. The published article can be found at: http://onlinelibrary.wiley.com/journal/10.1111/%28ISSN%291365-313X. Supporting information is available online at: http://onlinelibrary.wiley.com/doi/10.1111/tpj.12959/abstrac
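The abstract above refers to a minimal tiling path chosen from overlapping, physically mapped BACs. As a rough illustration only, the following sketch picks a smallest covering set of clones with the classic greedy interval-cover rule. The clone tuples, the coordinate system, and the function name `minimal_tiling_path` are assumptions made for this example; the source does not describe how the tiling path was actually computed, and real pipelines typically also require a minimum overlap between neighbouring clones.

```python
from typing import List, Tuple

# Hypothetical representation: (clone name, start, end) on one contig.
Clone = Tuple[str, int, int]


def minimal_tiling_path(clones: List[Clone]) -> List[Clone]:
    """Greedy interval cover: fewest clones spanning everything the input covers.

    Regions not covered by any clone (gaps) simply start a new segment.
    """
    ordered = sorted(clones, key=lambda c: c[1])  # sort by start coordinate
    path: List[Clone] = []
    covered_to = None  # right edge of the region covered so far
    i, n = 0, len(ordered)
    while i < n:
        if covered_to is None or ordered[i][1] > covered_to:
            # Nothing reaches this clone: begin a new segment at its start.
            covered_to = ordered[i][1]
        best = None
        # Among clones starting inside the covered region, keep the one
        # that extends furthest to the right.
        while i < n and ordered[i][1] <= covered_to:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best[2] > covered_to:  # skip clones that add no new coverage
            path.append(best)
            covered_to = best[2]
    return path


example = [("A", 0, 100), ("B", 50, 120), ("C", 90, 200),
           ("D", 150, 260), ("E", 300, 400)]
tiling = minimal_tiling_path(example)
```

In this toy input, clone B is redundant because A and C already overlap, and E sits beyond a gap, so it opens a new segment.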
Accurate Decoding of Pooled Sequenced Data Using Compressed Sensing
In order to overcome the limitations imposed by DNA barcoding when multiplexing a large number of samples in the current generation of high-throughput sequencing instruments, we have recently proposed a new protocol that leverages advances in combinatorial pooling design (group testing) (doi:10.1371/journal.pcbi.1003010). We have also demonstrated how this new protocol would enable de novo selective sequencing and assembly of large, highly repetitive genomes. Here we address the problem of decoding pooled sequenced data obtained from such a protocol. Our algorithm employs a synergistic combination of ideas from compressed sensing and the decoding of error-correcting codes. Experimental results on synthetic data for the rice genome and real data for the barley genome show that our novel decoding algorithm enables significantly higher quality assemblies than the previous approach.
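To make the decoding problem concrete, one can model the per-pool counts of a k-mer as y = A x, where A is the binary pooling design (pools by BACs) and x is a sparse, non-negative vector of per-BAC contributions. The sketch below is not the algorithm of the paper, which the abstract does not spell out. It is a simple greedy peeling decoder, in the spirit of matching pursuit, used here as a stand-in for sparse recovery. The names `greedy_peel_decode`, `design`, `y`, and `max_bacs` are hypothetical.

```python
from typing import Dict, List, Set


def greedy_peel_decode(design: Dict[str, Set[int]], y: List[float],
                       max_bacs: int = 3, tol: float = 1e-9) -> Dict[str, float]:
    """Greedy non-negative decoding of pooled counts.

    design maps each BAC to the (non-empty) set of pool indices containing it.
    y[p] is the observed count of one k-mer in pool p.
    Returns an estimated contribution per selected BAC (at most max_bacs).
    """
    residual = list(y)
    estimate: Dict[str, float] = {}
    for _ in range(max_bacs):
        best, best_amount, best_score = None, 0.0, 0.0
        for bac, pools in design.items():
            if bac in estimate:
                continue
            # Largest contribution that keeps every residual non-negative.
            amount = min(residual[p] for p in pools)
            # Total residual count explained by assigning that amount.
            score = amount * len(pools)
            if amount > tol and score > best_score:
                best, best_amount, best_score = bac, amount, score
        if best is None:
            break  # no remaining BAC explains any of the residual
        estimate[best] = best_amount
        for p in design[best]:
            residual[p] -= best_amount
    return estimate


# Toy design: 4 BACs, 6 pools, each BAC present in 3 pools.
toy_design = {"bac1": {0, 1, 2}, "bac2": {2, 3, 4},
              "bac3": {0, 3, 5}, "bac4": {1, 4, 5}}
# Counts produced by bac1 contributing 5 and bac2 contributing 3.
decoded = greedy_peel_decode(toy_design, [5, 5, 8, 3, 3, 0])
```

A decoder of this kind tolerates a k-mer shared by a few BACs, but it has no error model; handling sequencing errors is exactly where ideas from error-correcting codes would come in.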
Combinatorial pooling enables selective sequencing of the barley gene space
For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and are now in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as, in this case, gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.
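The deconvolution step described above can be illustrated with a toy example. In this sketch each BAC is placed in a known set of pools (its signature), each k-mer is mapped to the set of pools where it was observed, and a read is assigned to the BACs whose signatures are consistent with most of its k-mers. The three-pool design, the function names, and the majority-vote rule are assumptions for illustration; the abstract does not give the actual design or decoding rules, and this version ignores sequencing errors and reverse-complement strands.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, List, Set


def kmers(seq: str, k: int) -> Iterable[str]:
    """Yield every substring of length k."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_kmer_pools(pooled_reads: Dict[int, List[str]], k: int) -> Dict[str, Set[int]]:
    """Map each k-mer to the set of pools in which it occurs."""
    index: Dict[str, Set[int]] = {}
    for pool, reads in pooled_reads.items():
        for read in reads:
            for km in kmers(read, k):
                index.setdefault(km, set()).add(pool)
    return index


def decode_signature(observed: Set[int], design: Dict[str, FrozenSet[int]]) -> List[str]:
    """BACs whose full pool signature is contained in the observed pools."""
    return sorted(bac for bac, sig in design.items() if sig <= observed)


def assign_read(read: str, k: int, kmer_pools: Dict[str, Set[int]],
                design: Dict[str, FrozenSet[int]]) -> List[str]:
    """Assign a read to the BAC(s) supported by the most of its k-mers."""
    votes: Counter = Counter()
    for km in kmers(read, k):
        for bac in decode_signature(kmer_pools.get(km, set()), design):
            votes[bac] += 1
    if not votes:
        return []
    top = max(votes.values())
    return sorted(bac for bac, v in votes.items() if v == top)


# Toy design: 3 BACs, 3 pools, each BAC placed in 2 pools.
design = {"bac1": frozenset({0, 1}), "bac2": frozenset({1, 2}),
          "bac3": frozenset({0, 2})}
# Each pool holds reads from the BACs it contains.
pooled_reads = {0: ["ACGTAC", "TTAGCA"], 1: ["ACGTAC", "GGTTCC"],
                2: ["GGTTCC", "TTAGCA"]}
index = build_kmer_pools(pooled_reads, k=4)
assigned = assign_read("ACGTAC", 4, index, design)
```

Once every read carries a BAC label, the reads for each clone can be assembled separately, which is the clone-by-clone assembly the abstract refers to.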
Count distribution for the signatures of all distinct 26-mers [(a) rice synthetic data, (c) barley HV5] and all the reads [(b) rice synthetic data, (d) barley HV5] in the 91 pools of sequencing data. The x-axis represents the size of the signature and the y-axis is the absolute count.
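A distribution of the kind described in this caption could be tabulated along the following lines, assuming that the signature of a k-mer means the set of pools in which it occurs and that the size of the signature is the number of such pools. That reading, the function name `signature_size_histogram`, and the input layout are assumptions; the caption itself does not define them.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def signature_size_histogram(pooled_reads: Dict[int, List[str]], k: int) -> Counter:
    """Count distinct k-mers by the number of pools in which they appear."""
    pools_by_kmer = defaultdict(set)
    for pool, reads in pooled_reads.items():
        for read in reads:
            for i in range(len(read) - k + 1):
                pools_by_kmer[read[i:i + k]].add(pool)
    # Key: signature size (x-axis); value: number of distinct k-mers (y-axis).
    return Counter(len(pools) for pools in pools_by_kmer.values())


toy_pools = {0: ["ACGTA"], 1: ["ACGTT"], 2: ["CGTAC"]}
hist = signature_size_histogram(toy_pools, k=4)
```

With the caption's parameters one would call this with k = 26 over the 91 pools.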