81 research outputs found

    Construction, Characterization, and Preliminary BAC-End Sequence Analysis of a Bacterial Artificial Chromosome Library of the Tea Plant (Camellia sinensis)

    We describe the construction and characterization of a publicly available BAC library for the tea plant, Camellia sinensis. Using modified methods, the library was constructed with the aim of developing public molecular resources to advance tea plant genomics research. The library consists of a total of 401,280 clones with an average insert size of 135 kb, providing an approximate coverage of 13.5 haploid genome equivalents. No empty vector clones were observed in a random sampling of 576 BAC clones. Further analysis of 182 BAC-end sequences from randomly selected clones revealed a GC content of 40.35% and low chloroplast and mitochondrial contamination. Repetitive sequence analyses indicated that LTR retrotransposons were the most predominant sequence class (86.93%–87.24%), followed by DNA retrotransposons (11.16%–11.69%). Additionally, we found 25 simple sequence repeats (SSRs) that could potentially be used as genetic markers.

    Transposable element distribution, abundance and role in genome size variation in the genus Oryza

    Background: The genus Oryza is composed of 10 distinct genome types, 6 diploid and 4 polyploid, and includes the world's most important food crop, rice (Oryza sativa [AA]). Genome size variation in Oryza is more than 3-fold and ranges from 357 Mbp in Oryza glaberrima [AA] to 1283 Mbp in the polyploid Oryza ridleyi [HHJJ]. Because repetitive elements are known to play a significant role in genome size variation, we constructed random sheared small insert genomic libraries from 12 representative Oryza species and conducted a comprehensive study of the repetitive element composition, distribution and phylogeny in this genus. Particular attention was paid to the role played by the most important classes of transposable elements (Long Terminal Repeat retrotransposons, Long Interspersed Nuclear Elements, helitrons, DNA transposable elements) in shaping these genomes and in their contribution to genome size variation. Results: We identified the elements primarily responsible for the most striking genome size variation in Oryza. We demonstrated how Long Terminal Repeat retrotransposons belonging to the same families have proliferated to very different extents in various species. We also showed that the pool of Long Terminal Repeat retrotransposons is substantially conserved and ubiquitous throughout Oryza, so its origin is ancient and its existence predates the speciation events that gave rise to the genus. Finally, we described the peculiar behavior of repeats in the species Oryza coarctata [HHKK], whose placement in the Oryza genus is controversial. Conclusion: Long Terminal Repeat retrotransposons are the major component of the Oryza genomes analyzed and, along with polyploidization, are the most important contributors to genome size variation across the Oryza genus. Two families of Ty3-gypsy elements (RIRE2 and Atlantys) account for a significant portion of the genome size variation present in the Oryza genus.

    DNA methylation changes facilitated evolution of genes derived from Mutator-like transposable elements

    Supplementary file S2. Accession numbers and URLs for the genome assembly, transcriptome and methylome data used in this project. (DOCX 101 kb)

    Improved chromosome-level genome assembly and annotation of the seagrass, Zostera marina (eelgrass)

    Background: Seagrasses (Alismatales) are the only fully marine angiosperms. Zostera marina (eelgrass) plays a crucial role in the functioning of coastal marine ecosystems and global carbon sequestration. It is the most widely studied seagrass and has become a marine model system for exploring adaptation under rapid climate change. The original draft genome (v.1.0) of the seagrass Z. marina (L.) was based on a combination of Illumina mate-pair libraries and fosmid-ends. A total of 25.55 Gb of Illumina and 0.14 Gb of Sanger sequence was obtained, representing 47.7× genomic coverage. The assembly resulted in ~2000 unordered scaffolds (L50 of 486 kb), a final genome assembly size of 203 Mb, 20,450 protein-coding genes and 63% TE content. Here, we present an upgraded chromosome-scale genome assembly and compare v.1.0 and the new v.3.1, reconfirming previous results from Olsen et al. (2016), as well as pointing out new findings. Methods: The same high molecular weight DNA used in the original sequencing of the Finnish clone was used. A high-quality reference genome was assembled with the MECAT assembly pipeline combining PacBio long-read sequencing and Hi-C scaffolding. Results: In total, 75.97 Gb of PacBio data was produced. The final assembly comprises six pseudo-chromosomes and 304 unanchored scaffolds with a total length of 260.5 Mb and an N50 of 34.6 Mb, showing high contiguity and few gaps (~0.5%). 21,483 protein-encoding genes are annotated in this assembly, of which 20,665 (96.2%) obtained at least one functional assignment based on similarity to known proteins. Conclusions: As an important marine angiosperm, the improved Z. marina genome assembly will further assist evolutionary, ecological, and comparative genomics at the chromosome level. The new genome assembly will further our understanding of the structural and physiological adaptations from land to marine life.

    Construction of a bacterial artificial chromosome library from the spikemoss Selaginella moellendorffii: a new resource for plant comparative genomics

    BACKGROUND: The lycophytes are an ancient lineage of vascular plants that diverged from the seed plant lineage about 400 Myr ago. Although the lycophytes occupy an important phylogenetic position for understanding the evolution of plants and their genomes, no genomic resources exist for this group of plants. RESULTS: Here we describe the construction of a large-insert bacterial artificial chromosome (BAC) library from the lycophyte Selaginella moellendorffii. Based on cell flow cytometry, this species has the smallest genome size among the different lycophytes tested, including Huperzia lucidula, Diphasiastrum digitatum, Isoetes engelmannii and S. kraussiana. The arrayed BAC library consists of 9126 clones; the average insert size is estimated to be 122 kb. Inserts of chloroplast origin account for 2.3% of the clones. The BAC library contains an estimated ten genome-equivalents based on DNA hybridizations using five single-copy and two duplicated S. moellendorffii genes as probes. CONCLUSION: The S. moellendorffii BAC library, the first to be constructed from a lycophyte, will be useful to the scientific community as a resource for comparative plant genomics and evolution.

    Construction of a nurse shark (Ginglymostoma cirratum) bacterial artificial chromosome (BAC) library and a preliminary genome survey

    BACKGROUND: Sharks are members of the taxonomic class Chondrichthyes, the oldest living jawed vertebrates. Genomic studies of this group, in comparison to representative species in other vertebrate taxa, will allow us to theorize about the fundamental genetic, developmental, and functional characteristics in the common ancestor of all jawed vertebrates. AIMS: In order to obtain mapping and sequencing data for comparative genomics, we constructed a bacterial artificial chromosome (BAC) library for the nurse shark, Ginglymostoma cirratum. RESULTS: The BAC library consists of 313,344 clones with an average insert size of 144 kb, covering ~4.5 × 10^10 bp and thus providing an 11-fold coverage of the haploid genome. BAC end sequence analyses revealed, in addition to LINEs and SINEs commonly found in other animal and plant genomes, two new groups of nurse shark-specific repetitive elements, NSRE1 and NSRE2, that seem to be major components of the nurse shark genome. Screening the library with single-copy or multi-copy gene probes showed 6–28 primary positive clones per probe, of which 50–90% were true positives, demonstrating that the BAC library is representative of the different regions of the nurse shark genome. Furthermore, some BAC clones contained multiple genes, making physical mapping feasible. CONCLUSION: We have constructed a deep-coverage, high-quality, large insert, and publicly available BAC library for a cartilaginous fish. It will be very useful to the scientific community interested in shark genomic structure, comparative genomics, and functional studies. We found two new groups of repetitive elements specific to the nurse shark genome, which may contribute to the architecture and evolution of the nurse shark genome.

    Integrating a newly developed BAC-based physical mapping resource for Lolium perenne with a genome-wide association study across a L. perenne European ecotype collection identifies genomic contexts associated with agriculturally important traits

    Background and Aims: Lolium perenne (perennial ryegrass) is the most widely cultivated forage and amenity grass species in temperate areas worldwide and there is a need to understand the genetic architectures of key agricultural traits and crop characteristics that deliver wider environmental services. Our aim was to identify genomic regions associated with agriculturally important traits by integrating a bacterial artificial chromosome (BAC)-based physical map with a genome-wide association study (GWAS). Methods: BAC-based physical maps for L. perenne were constructed from ~212 000 high-information-content fingerprints using Fingerprint Contig and Linear Topology Contig software. BAC clones were associated with both BAC-end sequences and a partial minimum tiling path sequence. A panel of 716 L. perenne diploid genotypes from 90 European accessions was assessed in the field over 2 years, and genotyped using a Lolium Infinium SNP array. The GWAS was carried out using a linear mixed model implemented in TASSEL, and extended genomic regions associated with significant markers were identified through integration with the physical map. Key Results: Between ~3600 and 7500 physical map contigs were derived, depending on the software and probability thresholds used, and integrated with ~35 k sequenced BAC clones to develop a resource predicted to span the majority of the L. perenne genome. From the GWAS, eight different loci were significantly associated with heading date, plant width, plant biomass and water-soluble carbohydrate accumulation, seven of which could be associated with physical map contigs. This allowed the identification of a number of candidate genes. Conclusions: Combining the physical mapping resource with the GWAS has allowed us to extend the search for candidate genes across larger regions of the L. perenne genome and identified a number of interesting gene model annotations. These physical maps will aid in validating future sequence-based assemblies of the L. perenne genome. Funding: UK Biotechnology and Biological Sciences Research Council [BB/J004405/1, BB/CSP1730/1, BB/G012342/1]; Germinal Holdings (UK); Syngenta (UK); Vialactia Biosciences (NZ). Open access article.

    An Integrated Physical, Genetic and Cytogenetic Map of Brachypodium distachyon, a Model System for Grass Research

    The pooid subfamily of grasses includes some of the most important crop, forage and turf species, such as wheat, barley and Lolium. Developing genomic resources, such as whole-genome physical maps, for analysing the large and complex genomes of these crops and for facilitating biological research in grasses is an important goal in plant biology. We describe a bacterial artificial chromosome (BAC)-based physical map of the wild pooid grass Brachypodium distachyon and integrate this with whole genome shotgun sequence (WGS) assemblies using BAC end sequences (BES). The resulting physical map contains 26 contigs spanning the 272 Mb genome. BES from the physical map were also used to integrate a genetic map. This provides an independent validation and confirmation of the published WGS assembly. Mapped BACs were used in Fluorescence In Situ Hybridisation (FISH) experiments to align the integrated physical map and sequence assemblies to chromosomes with high resolution. The physical, genetic and cytogenetic maps, integrated with whole genome shotgun sequence assemblies, enhance the accuracy and durability of this important genome sequence and will directly facilitate gene isolation.

    Improved chromosome-level genome assembly and annotation of the seagrass, Zostera marina (eelgrass)

    BACKGROUND: Seagrasses (Alismatales) are the only fully marine angiosperms. Zostera marina (eelgrass) plays a crucial role in the functioning of coastal marine ecosystems and global carbon sequestration. It is the most widely studied seagrass and has become a marine model system for exploring adaptation under rapid climate change. The original draft genome (v.1.0) of the seagrass Z. marina (L.) was based on a combination of Illumina mate-pair libraries and fosmid-ends. A total of 25.55 Gb of Illumina and 0.14 Gb of Sanger sequence was obtained, representing 47.7× genomic coverage. The assembly resulted in ~2000 unordered scaffolds (L50 of 486 kb), a final genome assembly size of 203 Mb, 20,450 protein-coding genes and 63% TE content. Here, we present an upgraded chromosome-scale genome assembly and compare v.1.0 and the new v.3.1, reconfirming previous results from Olsen et al. (2016), as well as pointing out new findings. METHODS: The same high molecular weight DNA used in the original sequencing of the Finnish clone was used. A high-quality reference genome was assembled with the MECAT assembly pipeline combining PacBio long-read sequencing and Hi-C scaffolding. RESULTS: In total, 75.97 Gb of PacBio data was produced. The final assembly comprises six pseudo-chromosomes and 304 unanchored scaffolds with a total length of 260.5 Mb and an N50 of 34.6 Mb, showing high contiguity and few gaps (~0.5%). 21,483 protein-encoding genes are annotated in this assembly, of which 20,665 (96.2%) obtained at least one functional assignment based on similarity to known proteins. CONCLUSIONS: As an important marine angiosperm, the improved Z. marina genome assembly will further assist evolutionary, ecological, and comparative genomics at the chromosome level. The new genome assembly will further our understanding of the structural and physiological adaptations from land to marine life. Funding: The DOE-Joint Genome Institute, Berkeley, CA, USA, Community Sequencing Program 2019. http://f1000research.com