7 research outputs found

    Are we there yet? : reliably estimating the completeness of plant genome sequences

    Genome sequencing is becoming cheaper and faster thanks to the introduction of next-generation sequencing techniques. Dozens of new plant genome sequences have been released in recent years, ranging from small to gigantic repeat-rich or polyploid genomes. Most genome projects have a dual purpose: delivering a contiguous, complete genome assembly and creating a full catalog of correctly predicted genes. Frequently, the completeness of a species' gene catalog is measured using a set of marker genes that are expected to be present. This expectation can be defined along an evolutionary gradient, ranging from highly conserved genes to species-specific genes. Large-scale population resequencing studies have revealed that gene space is fairly variable even between closely related individuals, which limits the definition of the expected gene space and, consequently, the accuracy of estimates used to assess genome and gene space completeness. We argue that, based on the desired applications of a genome sequencing project, different completeness scores for the genome assembly and/or gene space should be determined. Using examples from several dicot and monocot genomes, we outline some pitfalls and recommendations regarding methods to estimate completeness during different steps of genome assembly and annotation.
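The marker-gene idea in the abstract above can be illustrated with a minimal sketch. It assumes completeness is simply the fraction of an expected marker set that is recovered in an annotation; the function name `completeness` and the gene identifiers are hypothetical and are not taken from the paper, which discusses why such a single score can be misleading.

```python
def completeness(expected_markers, annotated_genes):
    """Fraction of expected marker genes that appear in the annotation.

    Hypothetical helper for illustration only: real tools also
    distinguish fragmented and duplicated markers.
    """
    expected = set(expected_markers)
    if not expected:
        raise ValueError("expected marker set is empty")
    found = expected & set(annotated_genes)
    return len(found) / len(expected)


# Toy data (made up): four expected markers, three of them annotated.
core_markers = {"geneA", "geneB", "geneC", "geneD"}
annotation = {"geneA", "geneC", "geneD", "geneX"}

score = completeness(core_markers, annotation)
print(f"completeness: {score:.2f}")
```

Swapping in a different expected set (for example, highly conserved genes versus lineage-specific genes) changes the score for the same annotation, which is the dependence on the definition of expected gene space that the abstract points out.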

    Overcoming challenges in variant calling : exploring sequence diversity in candidate genes for plant development in perennial ryegrass (Lolium perenne)

    Revealing DNA sequence variation within the Lolium perenne genepool is important for genetic analysis and development of breeding applications. We reviewed current literature on plant development to select candidate genes in pathways that control agronomic traits, and identified 503 orthologues in L. perenne. Using targeted resequencing, we constructed a comprehensive catalogue of genomic variation for an L. perenne germplasm collection of 736 genotypes derived from current cultivars, breeding material and wild accessions. To overcome challenges of variant calling in heterogeneous outbreeding species, we used two complementary strategies to explore sequence diversity. First, four variant calling pipelines were integrated with the VariantMetaCaller to reach maximal sensitivity. Additional multiplex amplicon sequencing was used to empirically estimate an appropriate precision threshold. Second, a de novo assembly strategy was used to reconstruct divergent alleles for each gene. The advantage of this approach was illustrated by discovery of 28 novel alleles of LpSDUF247, a polymorphic gene co-segregating with the S-locus of the grass self-incompatibility system. Our approach is applicable to other genetically diverse outbreeding species. The resulting collection of functionally annotated variants can be mined for variants causing phenotypic variation, either through genetic association studies, or by selecting carriers of rare defective alleles for physiological analyses.

    Canonical correlations reveal adaptive loci and phenotypic responses to climate in perennial ryegrass

    Germplasm from perennial ryegrass (Lolium perenne L.) natural populations is useful for breeding because of its adaptation to a wide range of climates. Climate‐adaptive genes can be detected from associations between genotype, phenotype and climate, but an integrated framework for the analysis of these three sources of information is lacking. We used two approaches to identify adaptive loci in perennial ryegrass and their effect on phenotypic traits. First, we combined Genome‐Environment Association (GEA) and GWAS analyses. Then, we implemented a new test based on a Canonical Correlation Analysis (CANCOR) to detect adaptive loci. Furthermore, we improved the previous perennial ryegrass gene set by de novo gene prediction and functional annotation of 39,967 genes. GEA‐GWAS revealed eight outlier loci associated with both environmental variables and phenotypic traits. CANCOR retrieved 633 outlier loci associated with two climatic gradients, characterized by cold‐dry winter versus mild‐wet winter and long rainy season versus long summer, and pointed out traits putatively conferring adaptation at the extremes of these gradients. Our CANCOR test also revealed the presence of both polygenic and oligogenic climatic adaptations. Our gene annotation revealed that 374 of the CANCOR outlier loci were positioned within or close to a gene. Co‐association networks of outlier loci revealed a potential utility of CANCOR for investigating the interaction of genes involved in polygenic adaptations. The CANCOR test provides an integrated framework to analyse adaptive genomic diversity and phenotypic responses to environmental selection pressures that could be used to facilitate the adaptation of plant species to climate change.

    Development of genomic resources for perennial ryegrass


    Chromosome-scale assembly and annotation of the perennial ryegrass genome

    Background: The availability of chromosome-scale genome assemblies is fundamentally important to advance genetics and breeding in crops, as well as for evolutionary and comparative genomics. The improvement of long-read sequencing technologies and the advent of optical mapping and chromosome conformation capture technologies in the last few years significantly promoted the development of chromosome-scale genome assemblies of model plants and crop species. In grasses, chromosome-scale genome assemblies recently became available for cultivated and wild species of the Triticeae subfamily. Development of state-of-the-art genomic resources in species of the Poeae subfamily, which includes important crops like fescues and ryegrasses, is lagging behind the progress in the cereal species. Results: Here, we report a new chromosome-scale genome sequence assembly for perennial ryegrass, obtained by combining PacBio long-read sequencing, Illumina short-read polishing, BioNano optical mapping and Hi-C scaffolding. More than 90% of the total genome size of perennial ryegrass (approximately 2.55 Gb) is covered by seven pseudo-chromosomes that show high levels of collinearity to the orthologous chromosomes of Triticeae species. The transposon fraction of perennial ryegrass was found to be relatively low, approximately 35% of the total genome content, which is less than half of the genome repeat content of cultivated cereal species. We predicted 54,629 high-confidence gene models, 10,287 long non-coding RNAs and a total of 8,393 short non-coding RNAs in the perennial ryegrass genome. Conclusions: The new reference genome sequence and annotation presented here are valuable resources for comparative genomic studies in grasses, as well as for breeding applications, and will expedite the development of productive varieties in perennial ryegrass and related species.

    Pooling resources: Allele frequency fingerprinting in Lolium perenne

    Allele frequency fingerprinting of heterogeneous plant populations of outbreeding species can be used for variety identification, association mapping, genomic selection or characterization of genetic resources. In the FACCE-JPI GrassLandscape project, we empirically validated a pool-GBS method for Genome-Wide Allele Frequency Fingerprinting (GWAFF). As pool-GBS cannot be targeted to predefined loci such as candidate genes, we integrated it with targeted resequencing using a highly multiplexed amplicon sequencing strategy to measure allele frequencies (pool-HiPlex). A pool-HiPlex assay was designed that amplifies 185 amplicons in 41 L. perenne genes in just two parallel PCR reactions. We validated pool-GBS and pool-HiPlex using pools of 48 individuals, chosen to represent a wide range of genetic diversity in L. perenne, and addressed completeness, reproducibility and accuracy of allele frequencies with >1000 HiPlex SNPs and >150,000 GBS SNPs. We consistently found high correlations between allele frequencies obtained by genotyping individual plants and pool genotyping on leaf tissue pools and DNA extract pools. We also analyzed the error introduced at various steps of the protocol such as weighing, DNA quantification, pooling, ligation and/or PCR amplification. Applying a minor allele frequency threshold of 5% or 3% effectively removed nonreproducible SNPs in pool-GBS and pool-HiPlex, respectively. Allele frequency spectra could be obtained for single SNPs as well as for haplotypes spanning neighboring SNPs using read-backed phasing. Application of this methodology to a set of 470 natural populations of L. perenne sampled across Europe and the Fertile Crescent revealed a geographical pattern of genetic differentiation in this species.
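The validation logic described in the abstract above (estimate allele frequencies from pooled reads, drop SNPs below a minor allele frequency threshold, then correlate pool estimates with frequencies from individually genotyped plants) can be sketched as follows. This is an illustrative sketch only: the read counts and individual-based frequencies are made up, the function names are hypothetical, and the authors' actual pipeline is not described in enough detail here to reproduce. Only the 5% and 3% thresholds come from the abstract.

```python
from math import sqrt

# Thresholds quoted in the abstract.
MAF_POOL_GBS = 0.05
MAF_POOL_HIPLEX = 0.03


def allele_frequency(alt_reads, total_reads):
    """Alternative-allele frequency estimated from pooled read counts."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def passes_maf(freq, threshold):
    """True if the minor allele frequency reaches the threshold."""
    return min(freq, 1.0 - freq) >= threshold


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up example: (alt reads, total reads) per SNP in one pool,
# and the frequency of the same SNPs from individual genotyping.
pool_counts = [(50, 100), (2, 100), (30, 120), (90, 100)]
individual_freqs = [0.48, 0.01, 0.27, 0.88]

pool_freqs = [allele_frequency(alt, total) for alt, total in pool_counts]

# Keep only SNPs whose pooled MAF reaches the pool-GBS threshold.
kept = [i for i, f in enumerate(pool_freqs) if passes_maf(f, MAF_POOL_GBS)]

r = pearson([pool_freqs[i] for i in kept],
            [individual_freqs[i] for i in kept])
print(f"SNPs kept: {kept}, correlation: {r:.3f}")
```

In this toy data the second SNP has a pooled frequency of 2% and is filtered out, which mirrors the idea that very rare alleles are hard to distinguish from sequencing or PCR error in a pool.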