89 research outputs found

    Towards plant pangenomics

    As an increasing number of genome sequences become available for a wide range of species, there is a growing understanding that the genome of a single individual is insufficient to represent the gene diversity within a whole species. Many studies examine the sequence diversity within genes, and this allelic variation is an important source of phenotypic variation which can be selected for by man or nature. However, the significant gene presence/absence variation that has been observed within species and the impact of this variation on traits is only now being studied in detail. The sum of the genes for a species is termed the pangenome, and the determination and characterization of the pangenome is a requirement to understand variation within a species. In this review, we explore the current progress in pangenomics as well as methods and approaches for the characterization of pangenomes for a wide range of plant species

    Super-Pangenome by Integrating the Wild Side of a Species for Accelerated Crop Improvement

    The pangenome provides genomic variations in the cultivated gene pool for a given species. However, as the crop’s gene pool comprises many species, especially wild relatives with diverse genetic stock, here we suggest using accessions from all available species of a given genus for the development of a more comprehensive and complete pangenome, which we refer to as a super-pangenome. The super-pangenome provides a complete genomic variation repertoire of a genus and offers unprecedented opportunities for crop improvement. This opinion article focuses on recent developments in crop pangenomics, the need for a super-pangenome that should include wild species, and its application for crop improvement

    Grain dispersal mechanism in cereals arose from a genome duplication followed by changes in spatial expression of genes involved in pollen development

    KEY MESSAGE: Grain disarticulation in the wild progenitors of wheat and barley evolved through a local duplication event followed by neo-functionalization resulting from changes in location of gene expression. ABSTRACT: One of the most critical events in the process of cereal domestication was the loss of the natural mode of grain dispersal. Grain dispersal in barley is controlled by two major genes, Btr1 and Btr2, which affect the thickness of cell walls around the disarticulation zone. The barley genome also encodes Btr1-like and Btr2-like genes, which have been shown to be the ancestral copies. While Btr and Btr-like genes are non-redundant, the biological function of Btr-like genes is unknown. We explored the potential biological role of the Btr-like genes by surveying their expression profile across 212 publicly available transcriptome datasets representing diverse organs, developmental stages and stress conditions. We found that Btr1-like and Btr2-like are expressed exclusively in immature anther samples throughout Prophase I of meiosis within the meiocyte. The similar and restricted expression profile of these two genes suggests they are involved in a common biological function. Further analysis revealed 141 genes co-expressed with Btr1-like and 122 genes co-expressed with Btr2-like, with 105 genes in common, supporting the involvement of Btr-like genes in a shared molecular pathway. We hypothesize that the Btr-like genes play a crucial role in pollen development by facilitating the formation of the callose wall around the meiocyte or in the secretion of callase by the tapetum. Our data suggest that Btr genes retained an ancestral function in cell wall modification and gained a new role in grain dispersal due to changes in their spatial expression becoming spike specific after gene duplication. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00122-022-04029-8

    Trait associations in the pangenome of pigeon pea (Cajanus cajan)

    Pigeon pea (Cajanus cajan) is an important orphan crop mainly grown by smallholder farmers in India and Africa. Here, we present the first pigeon pea pangenome based on 89 accessions mainly from India and the Philippines, showing that there is significant genetic diversity in Philippine individuals that is not present in Indian individuals. Annotation of variable genes suggests that they are associated with self-fertilization and response to disease. We identified 225 SNPs associated with nine agronomically important traits over three locations and two different time points, with SNPs associated with genes for transcription factors and kinases. These results will lead the way to an improved pigeon pea breeding programme

    The pangenome of hexaploid bread wheat

    There is an increasing understanding that variation in gene presence–absence plays an important role in the heritability of agronomic traits; however, there have been relatively few studies on variation in gene presence–absence in crop species. Hexaploid wheat is one of the most important food crops in the world and intensive breeding has reduced the genetic diversity of elite cultivars. Major efforts have produced draft genome assemblies for the cultivar Chinese Spring, but it is unknown how well this represents the genome diversity found in current modern elite cultivars. In this study we build an improved reference for Chinese Spring and explore gene diversity across 18 wheat cultivars. We predict a pangenome size of 140 500 ± 102 genes, a core genome of 81 070 ± 1631 genes and an average of 128 656 genes in each cultivar. Functional annotation of the variable gene set suggests that it is enriched for genes that may be associated with important agronomic traits. In addition to variation in gene presence, more than 36 million intervarietal single nucleotide polymorphisms were identified across the pangenome. This study of the wheat pangenome provides insight into genome diversity in elite wheat as a basis for genomics-based improvement of this important crop. A wheat pangenome, GBrowse, is available at http://appliedbioinformatics.com.au/cgi-bin/gb2/gbrowse/WheatPan/, and data are available to download from http://wheatgenome.info/wheat_genome_databases.php

    The pangenome of an agronomically important crop plant Brassica oleracea

    There is an increasing awareness that as a result of structural variation, a reference sequence representing a genome of a single individual is unable to capture all of the gene repertoire found in the species. A large number of genes affected by presence/absence and copy number variation suggest that it may contribute to phenotypic and agronomic trait diversity. Here we show by analysis of the Brassica oleracea pangenome that nearly 20% of genes are affected by presence/absence variation. Several genes displaying presence/absence variation are annotated with functions related to major agronomic traits, including disease resistance, flowering time, glucosinolate metabolism and vitamin biosynthesis

    An investigation of causes of false positive single nucleotide polymorphisms using simulated reads from a small eukaryote genome

    Background: Single Nucleotide Polymorphisms (SNPs) are widely used molecular markers, and their use has increased massively since the inception of Next Generation Sequencing (NGS) technologies, which allow detection of large numbers of SNPs at low cost. However, both NGS data and their analysis are error-prone, which can lead to the generation of false positive (FP) SNPs. We explored the relationship between FP SNPs and seven factors involved in mapping-based variant calling - quality of the reference sequence, read length, choice of mapper and variant caller, mapping stringency and filtering of SNPs by read mapping quality and read depth. This resulted in 576 possible factor level combinations. We used error- and variant-free simulated reads to ensure that every SNP found was indeed a false positive. Results: The variation in the number of FP SNPs generated ranged from 0 to 36,621 for the 120 million base pairs (Mbp) genome. All of the experimental factors tested had statistically significant effects on the number of FP SNPs generated and there was a considerable amount of interaction between the different factors. Using a fragmented reference sequence led to a dramatic increase in the number of FP SNPs generated, as did relaxed read mapping and a lack of SNP filtering. The choice of reference assembler, mapper and variant caller also significantly affected the outcome. The effect of read length was more complex and suggests a possible interaction between mapping specificity and the potential for contributing more false positives as read length increases. Conclusions: The choice of tools and parameters involved in variant calling can have a dramatic effect on the number of FP SNPs produced, with particularly poor combinations of software and/or parameter settings yielding tens of thousands in this experiment. Between-factor interactions make simple recommendations difficult for a SNP discovery pipeline but the quality of the reference sequence is clearly of paramount importance. Our findings are also a stark reminder that it can be unwise to use the relaxed mismatch settings provided as defaults by some read mappers when reads are being mapped to a relatively unfinished reference sequence from e.g. a non-model organism in its early stages of genomic exploration

    Assembly and comparison of two closely related Brassica napus genomes

    Get PDF
    As an increasing number of plant genome sequences become available, it is clear that gene content varies between individuals, and the challenge arises to predict the gene content of a species. However, genome comparison is often confounded by variation in assembly and annotation. Differentiating between true gene absence and variation in assembly or annotation is essential for the accurate identification of conserved and variable genes in a species. Here, we present the de novo assembly of the B. napus cultivar Tapidor and comparison with an improved assembly of the Brassica napus cultivar Darmor-bzh. Both cultivars were annotated using the same method to allow comparison of gene content. We identified genes unique to each cultivar and differentiated these from artefacts due to variation in the assembly and annotation. We demonstrate that using a common annotation pipeline can result in different gene predictions, even for closely related cultivars, and that repeat regions which collapse during assembly impact whole genome comparison. After accounting for differences in assembly and annotation, we demonstrate that the genome of Darmor-bzh contains a greater number of genes than the genome of Tapidor. Our results are the first step towards comparison of the true differences between B. napus genomes and highlight the potential sources of error in future production of a B. napus pangenome