3,131 research outputs found

    Genomic and phenotypic characterization of finger millet indicates a complex diversification history

    Advances in sequencing technologies mean that insights into crop diversification can now be explored in crops beyond major staples. We use a genome assembly of finger millet, an allotetraploid orphan crop, to analyze DArTseq single nucleotide polymorphisms (SNPs) at the whole and sub‐genome level. A set of 8778 SNPs and 13 agronomic traits was used to characterize a diverse panel of 423 landraces from Africa and Asia. Through principal component analysis (PCA) and discriminant analysis of principal components, four distinct groups of accessions were identified that coincided with the primary geographic regions of finger millet cultivation. Notably, East Africa, presumed to be the crop's origin, exhibited the lowest genetic diversity. The PCA of phenotypic data also revealed geographic differentiation, albeit with different relationships among geographic areas than those indicated by genomic data. Further exploration of the sub‐genomes A and B using neighbor‐joining trees revealed distinct features that provide supporting evidence for the complex evolutionary history of finger millet. Although a genome‐wide association study found only a limited number of significant marker‐trait associations, a clustering approach based on the distribution of marker effects obtained from a ridge regression genomic model was employed to investigate trait complexity. This analysis uncovered two distinct clusters. Overall, the findings suggest that finger millet has undergone complex and context‐specific diversification, indicative of a lengthy domestication history. These analyses provide insights for the future development of finger millet.

    New computational methods and plant models for evolutionary genomics

    This thesis is in the service of a greater understanding of the genetic basis of adaptive traits. Chapter 1 introduces background literature relevant to this thesis. Chapters 2, 3, and 4 develop novel methods and software for the analysis of genetic sequencing data. Chapter 5 details a large collaborative project to establish genetic resources in the model cereal Brachypodium, and to perform a genome-wide association study for several agriculturally-relevant traits under two climate change scenarios. Chapter 6 investigates the spatial genetic patterns in two species of woodland eucalypt, and determines the landscape process that could be driving these patterns. Finally, Chapter 7 summarises these works, and proposes some areas of further study. In Chapters 2 and 3, I develop methods that enable analysis of genotyping-by-sequencing (GBS) data. Axe, a short read sequence demultiplexer, demultiplexes samples from multiplexed GBS sequencing datasets. I show Axe has high accuracy, and outperforms previously published software. Axe also tolerates complex indexing schemes such as the variable-length combinatorial indexes used in GBS data. Trimit and libqcpp (Chapter 3) implement several low-level sequence read quality assessment and control methods as a C++ library, and as a command line tool. Both these works have been published in peer-reviewed journals, and are used by numerous groups internationally. In Chapter 4, I develop kWIP, a de novo estimator of genetic distance. kWIP enables rapid estimation of genetic distances directly from sequence reads. We first show kWIP outperforms a competing method at low coverage using simulations that mimic a population resequencing experiment. We propose and demonstrate several use cases for kWIP, including population resequencing, initial assessment of sample identity, and estimating metagenomic similarity. kWIP was published in PLoS Computational Biology.
In Chapter 5, I present the results of a large, collaborative project which surveys the global genetic diversity of the model cereal Brachypodium. We amass a collection of over 2000 accessions from the Brachypodium species complex. Using GBS and whole genome sequencing we identify around 800 accessions of the diploid Brachypodium distachyon, within which we find extensive population structure and clonal families. Through population restructuring we create a core collection of 74 accessions containing the majority of genetic diversity in the "A genome" sub-population. Using this core collection, we assay several phenotypes of agricultural interest including early vigour, harvest index and energy use efficiency under two climates, and dissect the genetic basis of these traits using a genome-wide association study (GWAS). This work has been accepted for publication at Genetics; I am co-first author with Pip Wilson and Jared Streich, having led many genomic analyses. In Chapter 6, I perform a study of landscape genomic variation in two woodland eucalypt species. Using whole genome sequencing of around 200 individuals from around 20 localities of both E. albens and E. sideroxylon, I find incredible genetic diversity and low genome-wide inter-species differentiation. I find no support for strong discrete population structure, but strong support for isolation by (geographic) distance (IBD). Using generalised dissimilarity modelling, I further examine the pattern of IBD, and establish additional isolation by environment (IBE). E. albens shows moderately strong IBD, explaining 26% of deviance in genetic distance using geographic distance, and an additional 6% deviance explained by incorporating environmental predictors (IBE). E. sideroxylon shows much stronger IBD, with 78% of deviance explained by geography, and stronger IBE (12% additional deviance explained). This work will soon be submitted for publication.

    Data-driven, participatory characterization of farmer varieties discloses teff breeding potential under current and future climates

    In smallholder farming systems, traditional farmer varieties of neglected and underutilized species (NUS) support the livelihoods of millions of growers and consumers. NUS combine cultural and agronomic value with local adaptation, and transdisciplinary methods are needed to fully evaluate their breeding potential. Here, we assembled and characterized the genetic diversity of a representative collection of 366 Ethiopian teff (Eragrostis tef) farmer varieties and breeding materials, describing their phylogenetic relations and local adaptation on the Ethiopian landscape. We phenotyped the collection for its agronomic performance, involving local teff farmers in a participatory variety evaluation. Our analyses revealed environmental patterns of teff genetic diversity and allowed us to identify 10 genetic clusters associated with climate variation and with uneven spatial distribution. A genome-wide association study was used to identify loci and candidate genes related to phenology, yield, local adaptation, and farmers' appreciation. The estimated teff genomic offset under climate change scenarios highlighted an area around Lake Tana where teff cropping may be most vulnerable to climate change. Our results show that transdisciplinary approaches may efficiently propel untapped NUS farmer varieties into modern breeding to foster more resilient and sustainable cropping systems.

    Ontologies for increasing the FAIRness of plant research data

    The importance of improving the FAIRness (findability, accessibility, interoperability, reusability) of research data is undeniable, especially in the face of large, complex datasets currently being produced by omics technologies. Facilitating the integration of a dataset with other types of data increases the likelihood of reuse, and the potential of answering novel research questions. Ontologies are a useful tool for semantically tagging datasets, as adding relevant metadata increases the understanding of how data was produced and increases its interoperability. Ontologies provide concepts for a particular domain as well as the relationships between concepts. By tagging data with ontology terms, data becomes both human and machine interpretable, allowing for increased reuse and interoperability. However, the task of identifying ontologies relevant to a particular research domain or technology is challenging, especially within the diverse realm of fundamental plant research. In this review, we outline the ontologies most relevant to the fundamental plant sciences and how they can be used to annotate data related to plant-specific experiments within metadata frameworks, such as Investigation-Study-Assay (ISA). We also outline repositories and platforms most useful for identifying applicable ontologies or finding ontology terms. Comment: 34 pages, 4 figures, 1 table, 1 supplementary table

    An integrated molecular and conventional breeding scheme for enhancing genetic gain in maize in Africa

    Open Access Journal; Published online: 06 Nov 2019. Maize production in West and Central Africa (WCA) is constrained by a wide range of interacting stresses that keep productivity below potential yields. Among the many problems afflicting maize production in WCA, drought, foliar diseases, and parasitic weeds are the most critical. Several decades of efforts devoted to the genetic improvement of maize have resulted in remarkable genetic gain, leading to increased yields of maize on farmers’ fields. The revolution unfolding in the areas of genomics, bioinformatics, and phenomics is generating innovative tools, resources, and technologies for transforming crop breeding programs. It is envisaged that such tools will be integrated within maize breeding programs, thereby advancing these programs and addressing current and future challenges. Accordingly, the maize improvement program within the International Institute of Tropical Agriculture (IITA) is undergoing a process of modernization through the introduction of innovative tools and new schemes that are expected to enhance genetic gains and impact on smallholder farmers in the region. Genomic tools enable genetic dissections of complex traits and promote an understanding of the physiological basis of key agronomic and nutritional quality traits. Marker-aided selection and genome-wide selection schemes are being implemented to accelerate genetic gain relating to yield, resilience, and nutritional quality. Therefore, strategies that effectively combine genotypic information with data from field phenotyping and laboratory-based analysis are currently being optimized. Molecular breeding, guided by methodically defined product profiles tailored to different agroecological zones and conditions of climate change, supported by state-of-the-art decision-making tools, is pivotal for the advancement of modern, genomics-aided maize improvement programs. Accelerated genetic gain, in turn, catalyzes a faster variety replacement rate.
It is critical to forge and strengthen partnerships for enhancing the impacts of breeding products on farmers’ livelihoods. IITA has well-established channels for delivering its research products/technologies to partner organizations for further testing, multiplication, and dissemination across various countries within the subregion. Capacity building of national agricultural research systems (NARS) will facilitate the smooth transfer of technologies and best practices from IITA and its partners.

    Learning from data: Plant breeding applications of machine learning

    Increasingly, new sources of data are being incorporated into plant breeding pipelines. Enormous amounts of data from field phenomics and genotyping technologies place data mining and analysis on a completely different level that is challenging from practical and theoretical standpoints. Intelligent decision-making relies on our capability of extracting from data useful information that may help us to achieve our goals more efficiently. Many plant breeders, agronomists and geneticists perform analyses without knowing relevant underlying assumptions, strengths or pitfalls of the employed methods. The study endeavors to assess statistical learning properties and plant breeding applications of supervised and unsupervised machine learning techniques. A soybean nested association panel (a.k.a. SoyNAM) was the base population for experiments designed in situ and in silico. We used mixed models and Markov random fields to evaluate phenotypic-genotypic-environmental associations among traits and learning properties of genome-wide prediction methods. Alternative methods for analyses were proposed.

    Genome-Wide Analysis of Grain Yield Stability and Environmental Interactions in a Multiparental Soybean Population

    Genetic improvement toward optimized and stable agronomic performance of soybean genotypes is desirable for food security. Understanding how genotypes perform in different environmental conditions helps breeders develop sustainable cultivars adapted to target regions. Complex traits of importance are known to be controlled by a large number of genomic regions with small effects whose magnitude and direction are modulated by environmental factors. Knowledge of the constraints and undesirable effects resulting from genotype by environmental interactions is a key objective in improving selection procedures in soybean breeding programs. In this study, the genetic basis of soybean grain yield responsiveness to environmental factors was examined in a large soybean nested association population. For this, a genome-wide association analysis of performance stability estimates generated from a Finlay-Wilkinson analysis, including the interaction between marker genotypes and environmental factors, was implemented. Genomic footprints were investigated by analysis and meta-analysis using a recently published multiparent model. Results indicated that specific soybean genomic regions were associated with stability, and that multiplicative interactions were present between environments and genetic background. Seven genomic regions in six chromosomes were identified as being associated with genotype-by-environment interactions. This study provides insight into genomic assisted breeding aimed at achieving a more stable agronomic performance of soybean, and documents opportunities to exploit genomic regions that were specifically associated with interactions involving environments and subpopulations.

    Harnessing genetic potential of wheat germplasm banks through impact-oriented-prebreeding for future food and nutritional security

    The value of exotic wheat genetic resources for accelerating grain yield gains is largely unproven and unrealized. We used next-generation sequencing, together with multi-environment phenotyping, to study the contribution of exotic genomes to 984 three-way-cross-derived (exotic/elite1//elite2) pre-breeding lines (PBLs). Genomic characterization of these lines with haplotype map-based and SNP marker approaches revealed exotic-specific imprints of 16.1 to 25.1%, which compares to a theoretical expectation of 25%. A rare and favorable haplotype (GT) with 0.4% frequency in the gene bank, identified on chromosome 6D, minimized grain yield (GY) loss under heat stress without GY penalty under irrigated conditions. More specifically, the ‘T’ allele of the haplotype GT originated in Aegilops tauschii and was absent in all elite lines used in the study. In silico analysis of the SNP showed hits with a candidate gene coding for isoflavone reductase IRL-like protein in Ae. tauschii. Rare haplotypes effective against abiotic/biotic stresses were also identified on chromosomes 1A, 6A and 2B. Results demonstrate positive contributions of exotic germplasm to PBLs derived from crosses of exotics with CIMMYT’s best elite lines. This is a major impact-oriented pre-breeding effort at CIMMYT, resulting in large-scale development of PBLs for deployment in breeding programs addressing food security under climate change scenarios.

    Association Mapping and Genomic Selection for Yield and Agronomic Traits in Soft Winter Wheat

    Tools such as genome-wide association study (GWAS) and genomic selection (GS) have expedited the development of crops with improved genetic potential. While GWAS aims to identify significant markers associated with a trait of interest, the goal of GS is to utilize all marker effects to predict the performance of new breeding lines prior to testing. A GWAS for grain yield (GY), yield components, and agronomic traits was conducted using a diverse panel of 239 soft winter wheat (SWW) lines evaluated in eight site-years in Arkansas and Oklahoma. Broad sense heritability of GY (H² = 0.48) was moderate compared to other traits including plant height (H² = 0.81) and kernel weight (H² = 0.77). Markers associated with multiple traits on chromosomes 1A, 2D, 3B, and 4B serve as potential targets for marker assisted breeding to select for GY improvement. Validation of GY-related loci using spring wheat from the International Maize and Wheat Improvement Center (CIMMYT) in Mexico confirmed the effects of three loci on chromosomes 3A, 4B, and 6B. Lines possessing the favorable allele at all three loci (A-C-G allele combination) had the highest mean GY among the possible haplotypes. The same population of 239 lines was used in a GS study as a training population (TP) to determine factors that affect the predictability of GY. The TP size had the greatest effect on predictive ability across the measured traits. Adding covariates in the GS model was more advantageous in increasing prediction accuracies under single population cross validations than in forward predictions. Forward validation of the prediction models on two new populations resulted in a maximum accuracy of 0.43 for GY. Genomic selection was “superior” to marker-assisted selection in terms of response to selection, and combining phenotypic selection with GS resulted in the highest response. Results from this study can be used to accelerate the process of GY improvement and increase genetic gains in wheat breeding programs.

    Low coverage sequencing for repetitive DNA analysis in Passiflora edulis Sims: Citogenomic characterization of transposable elements and satellite DNA

    Background: The cytogenomic study of repetitive regions is fundamental for the understanding of morphofunctional mechanisms and genome evolution. Passiflora edulis is a species of relevant agronomic value; in this work, its genome was sequenced by next-generation sequencing, and bioinformatics analysis was performed with the RepeatExplorer pipeline. The clusters allowed the identification and characterization of repetitive elements (predominant contributors to most plant genomes). The aim of this study was to identify, characterize and map the repetitive DNA of P. edulis, providing important cytogenomic markers, especially sequences associated with the centromere. Results: Three clusters of satellite DNAs (69, 118 and 207) and seven clusters of Long Terminal Repeat (LTR) retrotransposons of the superfamilies Ty1/Copia and Ty3/Gypsy and families Angela, Athila, Chromovirus and Maximus-Sire (6, 11, 36, 43, 86, 94 and 135) were characterized and analyzed. The chromosome mapping of satellite DNAs showed two hybridization sites co-located in the 5S rDNA region (PeSat_1), subterminal hybridizations (PeSat_3) and hybridization in four sites, co-located in the 45S rDNA region (PeSat_2). Most of the retroelement hybridizations showed signals scattered across the chromosomes, diverging in abundance, and only cluster 6 marked pericentromeric regions. No satellite DNA or retroelement associated with the centromere was observed. Conclusion: P. edulis has a highly repetitive genome, with a predominance of Ty3/Gypsy LTR retrotransposons. The satellite DNAs and LTR retrotransposons characterized are promising markers for investigation of the evolutionary patterns and genetic distinction of species and hybrids of Passiflora.