10 research outputs found

    Sorghum Association Panel whole-genome sequencing establishes cornerstone resource for dissecting genomic diversity

    Association mapping panels represent foundational resources for understanding the genetic basis of phenotypic diversity and serve to advance plant breeding by exploring genetic variation across diverse accessions. We report the whole-genome sequencing (WGS) of 400 sorghum (Sorghum bicolor (L.) Moench) accessions from the Sorghum Association Panel (SAP) at an average coverage of 38× (25–72×), enabling the development of a high-density genomic marker set of 43 983 694 variants including single-nucleotide polymorphisms (approximately 38 million), insertions/deletions (indels) (approximately 5 million), and copy number variants (CNVs) (approximately 170 000). We observe slightly more deletions than insertions among indels and a much higher prevalence of deletions than insertions among CNVs. This new marker set enabled the identification of several novel putative genomic associations for plant height and tannin content, which were not identified when using previous lower-density marker sets. WGS identified and scored variants in 5-kb bins where available genotyping-by-sequencing (GBS) data captured no variants, with half of all bins in the genome falling into this category. The predictive ability of genomic best linear unbiased predictor (GBLUP) models was increased by an average of 30% by using WGS markers rather than GBS markers. We identified 18 selection peaks across subpopulations that formed due to evolutionary divergence during domestication, and we found six Fst peaks resulting from comparisons between converted lines and breeding lines within the SAP that were distinct from the peaks associated with historic selection. This population has served and continues to serve as a significant public resource for sorghum research and demonstrates the value of improving upon existing genomic resources.

    Functional genomic effects of indels using Bayesian genome-phenome wide association studies in sorghum

    High-throughput genomic and phenomic data have enhanced the ability to detect genotype-to-phenotype associations that can resolve broad pleiotropic effects of mutations on plant phenotypes. As the scale of genotyping and phenotyping has advanced, rigorous methodologies have been developed to accommodate larger datasets and maintain statistical precision. However, determining the functional effects of associated genes/loci is expensive and limited due to the complexity associated with cloning and subsequent characterization. Here, we utilized phenomic imputation of a multi-year, multi-environment dataset using PHENIX, which imputes missing data using kinship and correlated traits, and we screened insertions and deletions (InDels) from the recently whole-genome sequenced Sorghum Association Panel for putative loss-of-function effects. Candidate loci from genome-wide association results were screened for potential loss of function using a Bayesian Genome-Phenome Wide Association Study (BGPWAS) model across both functionally characterized and uncharacterized loci. Our approach is designed to facilitate in silico validation of associations beyond traditional candidate gene and literature-search approaches, to facilitate the identification of putative variants for functional analysis, and to reduce the incidence of false-positive candidates in current functional validation methods. Using this Bayesian GPWAS model, we identified associations for previously characterized genes with known loss-of-function alleles, specific genes falling within known quantitative trait loci, and genes without any previous genome-wide associations, while additionally detecting putative pleiotropic effects. In particular, we were able to identify the major tannin haplotypes at the Tan1 locus and the effects of InDels on protein folding. Depending on the haplotype present, heterodimer formation with Tan2 was significantly affected. We also identified major effect InDels in Dw2 and Ma1, where proteins were truncated due to frameshift mutations that resulted in early stop codons. These truncated proteins also lost most of their functional domains, suggesting that these InDels likely result in loss of function. Here, we show that the Bayesian GPWAS model is able to identify loss-of-function alleles that can have significant effects upon protein structure and folding as well as multimer formation. Our approach to characterizing loss-of-function mutations and their functional repercussions will facilitate precision genomics and breeding by identifying key targets for gene editing and trait integration.

    Meta-analysis identifies pleiotropic loci controlling phenotypic trade-offs in sorghum

    Community association populations are composed of phenotypically and genetically diverse accessions. Once these populations are genotyped, the resulting marker data can be reused by different groups investigating the genetic basis of different traits. Because the same genotypes are observed and scored for a wide range of traits in different environments, these populations represent a unique resource to investigate pleiotropy. Here, we assembled a set of 234 separate trait datasets for the Sorghum Association Panel, a group of 406 sorghum genotypes widely employed by the sorghum genetics community. Comparison of genome-wide association studies (GWAS) conducted with two independently generated marker sets for this population demonstrates that existing genetic marker sets do not saturate the genome and likely capture only 35–43% of potentially detectable loci controlling variation for traits scored in this population. While limited evidence for pleiotropy was apparent in cross-GWAS comparisons, a multivariate adaptive shrinkage approach recovered both known pleiotropic effects of existing loci and new pleiotropic effects, particularly significant impacts of known dwarfing genes on root architecture. In addition, we identified new loci with pleiotropic effects consistent with known trade-offs in sorghum development. These results demonstrate the potential for mining existing trait datasets from widely used community association populations to enable new discoveries as new, denser genetic marker datasets are generated for these populations.

    Discovering useful genetic variation in the seed parent gene pool for sorghum improvement

    Multi-parent populations contain valuable genetic material for dissecting complex, quantitative traits and provide a unique opportunity to capture multi-allelic variation compared to biparental populations. A multi-parent advanced generation inter-cross (MAGIC) B-line (MBL) population composed of 708 F6 recombinant inbred lines (RILs) was recently developed from four diverse founders. These selected founders strategically represented the four most prevalent botanical races (kafir, guinea, durra, and caudatum) to capture a significant source of genetic variation to study the quantitative traits in grain sorghum [Sorghum bicolor (L.) Moench]. MBL was phenotyped at two field locations for seven yield-influencing traits: panicle type (PT), days to anthesis (DTA), plant height (PH), grain yield (GY), 1000-grain weight (TGW), tiller number per meter (TN) and yield per panicle (YPP). High phenotypic variation was observed for all the quantitative traits, with broad-sense heritabilities ranging from 0.34 (TN) to 0.84 (PH). The entire population was genotyped using Diversity Arrays Technology (DArTseq), and 8,800 single nucleotide polymorphisms (SNPs) were generated. A set of polymorphic, quality-filtered markers (3,751 SNPs) and phenotypic data were used for genome-wide association studies (GWAS). We identified 52 marker-trait associations (MTAs) for the seven traits using BLUPs generated from replicated plots in two locations. We also identified desirable allelic combinations based on the plant height loci (Dw1, Dw2, and Dw3), which influence yield-related traits. Additionally, two novel MTAs were identified each on Chr1 and Chr7 for yield traits independent of dwarfing genes. We further performed a multivariate adaptive shrinkage analysis, and 15 MTAs with pleiotropic effects were identified. The five best-performing MBL progenies carrying desirable allelic combinations were selected. Since the MBL population was designed to capture significant diversity for maintainer line (B-line) accessions, these progenies can serve as valuable resources to develop superior sorghum hybrids after validation of their general combining abilities via crossing with elite pollinators. Further, newly identified desirable allelic combinations can be used to enrich the maintainer germplasm lines through marker-assisted backcross breeding.

    Genomics-Assisted Breeding for Grain Yield and Composition in Sorghum

    Cereal grains provide over half of the total calories for human and animal nutrition. Sorghum [Sorghum bicolor (L.) Moench] is the fifth most important cereal grain in the world and a staple food for over half a billion people in the semi-arid tropics. As the human population is projected to reach nine billion by the middle of this century, crop production needs to increase by 70% to 100% to meet the increasing demand for food. Advances in genomic technologies and their application in breeding have the potential to help assure food security. The objectives of this study were to explore the application of whole-genome markers in identifying marker-trait associations and potential gene candidates associated with the traits, and to evaluate the prediction performance of whole-genome regression models in sorghum. Grain yield and grain composition traits measured in multiple environments and populations were used in model training and cross-validation of prediction performance using different statistical approaches. In general, genomic prediction for grain yield components and grain composition showed moderate to high accuracy depending on trait genetic architecture. Prediction accuracy of yield components declined when population structure was controlled. Race explained up to 50% of covariance for grain and panicle traits, and subpopulations with high genetic diversity had higher prediction accuracy. The prediction accuracy of grain composition for the multi-trait model increased by 30-40% on average over the single-trait model, suggesting that multi-trait models using strongly correlated traits can increase genetic gain. A novel genomic association for starch was identified at ~52 Mb on chromosome 8, and five out of six associated variants were located within a heat shock protein 90 gene, Sobic.008G111600. Multivariate association for starch and protein identified additional variants around 60 Mb on chromosome 4, including one within the 5' UTR of a fatty acid desaturase gene, Sobic.004G260800. Our results show that genomic prediction can improve the accuracy of selection in sorghum breeding and that multivariate analysis of correlated traits can benefit association and prediction models.

    Identification of Novel Genomic Associations and Gene Candidates for Grain Starch Content in Sorghum

    Starch accumulated in the endosperm of cereal grains as reserve energy for germination serves as a staple in human and animal nutrition. Unraveling the genetic control of starch metabolism is important for breeding grains with high starch content. In this study, we used a sorghum association panel with 389 individuals and 141,557 single nucleotide polymorphisms (SNPs) to fit linear mixed models (LMM) for identifying genomic regions and potential candidate genes associated with starch content. Three associated genomic regions, one in chromosome (chr) 1 and two novel associations in chr-8, were identified using a combination of LMM and Bayesian sparse LMM. All significant SNPs were located within protein-coding genes, with SNPs at ∼52 Mb of chr-8 falling in genes encoding a Casparian strip membrane protein (CASP)-like protein (Sobic.008G111500) and a heat shock protein (HSP) 90 (Sobic.008G111600) that were highly expressed in reproductive tissues including within the embryo and endosperm. The HSP90 is a potential hub gene with a gene network of 75 high-confidence first interactors that is enriched for five biochemical pathways including protein processing. The first interactors of HSP90 also showed high transcript abundance in reproductive tissues. The candidates of this study are likely involved in intricate metabolic pathways and represent candidate gene targets for source-sink activities and drought and heat stress tolerance during grain filling.

    In Silico and Fluorescence In Situ Hybridization Mapping Reveals Collinearity between the Pennisetum squamulatum Apomixis Carrier-Chromosome and Chromosome 2 of Sorghum and Foxtail Millet

    Apomixis, or clonal propagation through seed, is a trait identified within multiple species of the grass family (Poaceae). The genetic locus controlling apomixis in Pennisetum squamulatum (syn Cenchrus squamulatus) and Cenchrus ciliaris (syn Pennisetum ciliare, buffelgrass) is the apospory-specific genomic region (ASGR). Previously, the ASGR was shown to be highly conserved but inverted in marker order between P. squamulatum and C. ciliaris based on fluorescence in situ hybridization (FISH) and varied in both karyotype and position of the ASGR on the ASGR-carrier chromosome among other apomictic Cenchrus/Pennisetum species. Using in silico transcript mapping and verification of physical positions of some of the transcripts via FISH, we discovered that the ASGR-carrier chromosome from P. squamulatum is collinear with chromosome 2 of foxtail millet and sorghum outside of the ASGR. The in silico ordering of the ASGR-carrier chromosome markers, previously unmapped in P. squamulatum, allowed for the identification of a backcross line with structural changes to the P. squamulatum ASGR-carrier chromosome derived from gamma irradiated pollen.