Genes Identified by Visible Mutant Phenotypes Show Increased Bias toward One of Two Subgenomes of Maize
Not all genes are created equal. Despite being supported by sequence conservation
and expression data, knockout homozygotes of many genes show no visible effects,
at least under laboratory conditions. We have identified a set of maize
(Zea mays L.) genes which have been the subject of a
disproportionate share of publications recorded at MaizeGDB. We manually
anchored these “classical” maize genes to gene models in the B73
reference genome, and identified syntenic orthologs in other grass genomes. In
addition to proofing the most recent version 2 maize gene models, we show that a
subset of these genes, those that were identified by morphological phenotype
prior to cloning, are retained at syntenic locations throughout the grasses at
much higher levels than the average expressed maize gene, and are preferentially
found on the maize1 subgenome even when a duplicate copy is still retained on
the opposite subgenome. Maize1 is the subgenome that experienced less gene loss
following the whole genome duplication in the maize lineage 5–12 million years
ago and genes located on this subgenome tend to be expressed at higher levels in
modern maize. Links to the web based software that supported our syntenic
analyses in the grasses should empower further research and support teaching
involving the history of maize genetic research. Our findings exemplify the
concept of “grasses as a single genetic system,” where what is
learned in one grass may be applied to another
RNA-Seq Based Analysis of Population Structure within the Maize Inbred B73
Recent reports have shown that many identically named genetic lines used in research around the world actually contain large amounts of uncharacterized genetic variation as a result of cross contamination of stocks, unintentional crossing, residual heterozygosity within original stocks, or de novo mutation. Twenty-seven public, large-scale RNA-seq datasets from 20 independent research groups around the world were used to assess variation within the maize (Zea mays ssp. mays) inbred B73, a four-decade-old variety which served as the reference genotype for the original maize genome sequencing project and is widely used in genetic, genomic, and phenotypic research. Several clearly distinct clades were identified among putatively B73 samples. A number of these clades were defined by the presence of clearly defined genomic blocks containing a haplotype which did not match the published B73 reference genome. The overall proportion of the maize genotype where multiple distinct haplotypes were observed across different research groups was approximately 2.3%. In some cases the relationship among B73 samples generated by different research groups recapitulated mentor/mentee relationships within the maize genetics community
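The 2.3% figure reported above is a proportion of genome length covered by blocks in which more than one haplotype was observed. The following is a minimal sketch of that arithmetic only, not the authors' pipeline. The function names, the block coordinates, and the genome length are all hypothetical and chosen purely for illustration.

```python
def merge_intervals(blocks):
    """Merge overlapping (start, end) half-open intervals on one chromosome."""
    merged = []
    for start, end in sorted(blocks):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous block: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def variable_fraction(blocks_by_chrom, genome_length):
    """Fraction of the genome covered by blocks with multiple haplotypes."""
    covered = 0
    for blocks in blocks_by_chrom.values():
        covered += sum(end - start for start, end in merge_intervals(blocks))
    return covered / genome_length


# Hypothetical toy data: two overlapping blocks on chr1, one block on chr2.
example_blocks = {"chr1": [(0, 10), (5, 20)], "chr2": [(100, 103)]}
fraction = variable_fraction(example_blocks, genome_length=1000)
print(f"{fraction:.1%} of the toy genome shows multiple haplotypes")
```

Merging overlapping blocks before summing avoids counting the same region twice when blocks reported by different research groups overlap.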
Optimising the identification of causal variants across varying genetic architectures in crops
Association studies use statistical links between genetic markers and phenotypic variation across many individuals to identify genes controlling variation in the target phenotype. However, this approach, particularly when conducted on a genome-wide scale (GWAS), has limited power to identify the genes responsible for variation in traits controlled by complex genetic architectures. In this study, we employ real-world genotype datasets from four crop species with distinct minor allele frequency distributions, population structures and linkage disequilibrium patterns. We demonstrate that different GWAS statistical approaches provide favourable trade-offs between power and accuracy for traits controlled by different types of genetic architectures. FarmCPU provides the most favourable outcomes for moderately complex traits while a Bayesian approach adopted from genomic prediction provides the most favourable outcomes for extremely complex traits. We assert that by estimating the complexity of genetic architectures for target traits and selecting an appropriate statistical approach for the degree of complexity detected, researchers can substantially improve the ability to dissect the genetic factors controlling complex traits such as flowering time, plant height and yield component
Functional Modeling of Plant Growth Dynamics
Recent advances in automated plant phenotyping have enabled the collection of time series measurements from the same plants of a wide range of traits at different developmental time scales. The availability of time series phenotypic datasets has increased interest in statistical approaches for comparing patterns of change among different plant genotypes and different treatment conditions. Two widely used methods of modeling growth with time are pointwise analysis of variance (ANOVA) and parametric sigmoidal curve fitting. Pointwise ANOVA yields discontinuous growth curves, which do not reflect the true dynamics of growth patterns in plants. In contrast, fitting a parametric model to a time series of observations does capture the trend of growth; however, these models require assumptions regarding the true pattern of plant growth. Depending on the species, treatment regime, and subset of the plant life cycle sampled, these assumptions will not always hold true. We have developed a different approach—functional ANOVA—which yields continuous growth curves without requiring assumptions regarding patterns of plant growth. We compared and validated this approach using data from an experiment measuring the growth of two maize (Zea mays L. ssp. mays) genotypes under two water availability treatments during a 21-d period. Functional ANOVA enables a nonparametric estimation of the dynamics of changes in plant traits with time without assumptions regarding curve shape. In addition to estimating smooth curves of trait values with time, functional ANOVA also estimates the derivatives of these curves, e.g., growth rates, simultaneously. Using two different subsampling strategies, we demonstrate that this functional ANOVA method enables the comparison of growth curves among plants phenotyped on non-overlapping days with little reduction in estimation accuracy. 
This means that functional ANOVA based approaches can allow larger numbers of samples and biological replicates to be scored in a single experiment given fixed amounts of phenotyping infrastructure and personnel
High Density Genetic Maps of Seashore Paspalum Using Genotyping-By-Sequencing and Their Relationship to The Sorghum Bicolor Genome.
As a step towards trait mapping in the halophyte seashore paspalum (Paspalum vaginatum Sw.), we developed an F1 mapping population from a cross between two genetically diverse and heterozygous accessions, 509022 and HI33. Progeny were genotyped using a genotyping-by-sequencing (GBS) approach and sequence reads were analyzed for single nucleotide polymorphisms (SNPs) using the UGbS-Flex pipeline. More markers were identified that segregated in the maternal parent (HA maps) compared to the paternal parent (AH maps), suggesting that 509022 had overall higher levels of heterozygosity than HI33. We also generated maps that consisted of markers that were heterozygous in both parents (HH maps). The AH, HA and HH maps each comprised more than 1000 markers. Markers formed 10 linkage groups, corresponding to the ten seashore paspalum chromosomes. Comparative analyses showed that each seashore paspalum chromosome was syntenic to and highly colinear with a single sorghum chromosome. Four inversions were identified, two of which were sorghum-specific while the other two were likely specific to seashore paspalum. These high-density maps are the first available genetic maps for seashore paspalum. The maps will provide a valuable tool for plant breeders and others in the Paspalum community to identify traits of interest, including salt tolerance
IsoSeq transcriptome assembly of C3 panicoid grasses provides tools to study evolutionary change in the Panicoideae
The number of plant species with genomic and transcriptomic data has been increasing rapidly. The grasses (Poaceae) have been well represented among species with published reference genomes. However, the genomes of wild grasses are less frequently targeted by sequencing efforts. Sequence data from wild relatives of crop species in the grasses can aid the study of domestication, gene discovery for breeding and crop improvement, and improve our understanding of the evolution of C4 photosynthesis. Here, we used long-read sequencing technology to characterize the transcriptomes of three C3 panicoid grass species: Dichanthelium oligosanthes, Chasmanthium laxum, and Hymenachne amplexicaulis. Based on alignments to the sorghum genome, we estimate that assembled consensus transcripts from each species capture between 54.2% and 65.7% of the conserved syntenic gene space in grasses. Genes co-opted into C4 were also well represented in this dataset, despite concerns that because these genes might play roles unrelated to photosynthesis in the target species, they would be expressed at low levels and missed by transcript-based sequencing. A combined analysis using syntenic orthologous genes from grasses with published reference genomes and consensus long-read sequences from these wild species was consistent with previously published phylogenies. It is hoped that these data, targeting underrepresented classes of species within the PACMAD grasses (wild species and species utilizing C3 photosynthesis), will aid in future studies of domestication and C4 evolution by decreasing the evolutionary distance between C4 and C3 species within this clade, enabling more accurate comparisons associated with evolution of the C4 pathway
Temporal dynamics of maize plant growth, water use, and leaf water content using automated high throughput RGB and hyperspectral imaging
Automated collection of large scale plant phenotype datasets using high throughput imaging systems has the potential to alleviate current bottlenecks in data-driven plant breeding and crop improvement. In this study, we characterize the temporal dynamics of plant growth, water use, and leaf water content of two maize genotypes under two different water treatments. RGB (Red Green Blue) images are processed to estimate projected plant area, which is correlated with destructively measured plant shoot fresh weight (FW), dry weight (DW) and leaf area. Estimated plant FW and DW, along with pot weights, are used to derive daily plant water consumption and water use efficiency (WUE) of the individual plants. Hyperspectral images of plants are processed to extract plant leaf reflectance and correlate it with leaf water content (LWC). Strong correlations are found between projected plant area and all three destructively measured plant parameters (R2 > 0.95) at early growth stages. The correlations become weaker at later growth stages due to the large difference in plant structure between the two maize genotypes. Daily water consumption (or evapotranspiration) is largely determined by water treatment, whereas WUE (or biomass accumulation per unit of water used) is clearly determined by genotype, indicating a strong genetic control of WUE. LWC is successfully predicted with the hyperspectral images for both genotypes (R2 = 0.81 and 0.92). Hyperspectral imaging can be a very powerful tool to phenotype biochemical traits of whole maize plants, complementing RGB for plant morphological trait analysis
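The abstract above states that daily water consumption and WUE are derived from estimated plant weights and pot weights, but does not give the formulas. The sketch below shows one plausible way such quantities could be computed. It assumes that a pot loses weight through evapotranspiration while gaining weight through plant growth, and that "biomass" in WUE means dry weight. The function names and the numbers are hypothetical, not taken from the study.

```python
def daily_water_use(pot_weight_after_watering, pot_weight_next_day, fresh_weight_gain):
    """Estimate water lost from one pot over one day, in grams.

    Assumption: the pot gets lighter as water is lost but heavier as the
    plant grows, so the estimated fresh-weight gain is added back.
    """
    return (pot_weight_after_watering - pot_weight_next_day) + fresh_weight_gain


def water_use_efficiency(dry_weight_gain, water_used):
    """Biomass accumulated per unit of water used (g dry weight per g water)."""
    if water_used <= 0:
        raise ValueError("water_used must be positive")
    return dry_weight_gain / water_used


# Hypothetical one-day example for a single plant (all values in grams).
water = daily_water_use(
    pot_weight_after_watering=5000.0,
    pot_weight_next_day=4800.0,
    fresh_weight_gain=10.0,
)
wue = water_use_efficiency(dry_weight_gain=1.05, water_used=water)
print(f"water used: {water:.0f} g, WUE: {wue:.4f} g/g")
```

Keeping the two steps separate mirrors the distinction drawn in the abstract: water consumption responds mainly to treatment, while the ratio of biomass gain to water used is the quantity compared between genotypes.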
High throughput in vivo analysis of plant leaf chemical properties using hyperspectral imaging
The possibility of predicting plant leaf chemical properties using hyperspectral images was studied. Sixty maize and 60 soybean plants were used, and two experiments were conducted: one with water limitation and the second with nutrient limitation, with the purpose of creating wide ranges of these chemical properties in plant leaf tissues. A hyperspectral imaging system with a spectral range from 550 to 1700 nm was used to acquire plant images in a high throughput fashion (plants placed on an automated conveyor belt). Leaf chemical properties were measured in the laboratory. Partial least squares regression was implemented on spectral data to successfully model and predict water content, micronutrient, and macronutrient concentrations
Comparative genomics with maize and other grasses: from genes to genomes!
Of all the major plant groups, the grasses, with the complete genomes of five species, are the best positioned to take advantage of comparative genomics to obtain insight into functional genetic elements. Of all the grasses, maize is the best characterized in terms of genetics, development, and evolution. We provide several examples of how the web-based comparative genomics system CoGe may be used to aid in the interpretation of the maize genome sequence. These examples include verifying gene models, identifying differences between genome assemblies, identifying conserved non-coding sequences, identifying syntenic regions between species and polyploidies, and identifying homeologs within maize and orthologs between maize and other grass genomes. In addition, a comprehensive list of orthologous gene sets is provided between maize and Sorghum, foxtail millet, rice, and Brachypodium
Escape from Preferential Retention Following Repeated Whole Genome Duplications in Plants
The well supported gene dosage hypothesis predicts that genes encoding proteins engaged in dose–sensitive interactions cannot be reduced back to single copies once all interacting partners are simultaneously duplicated in a whole genome duplication. The genomes of extant flowering plants are the result of many sequential rounds of whole genome duplication, yet the fraction of genomes devoted to encoding complex molecular machines does not increase as fast as expected through multiple rounds of whole genome duplications. Using parallel interspecies genomic comparisons in the grasses and crucifers, we demonstrate that genes retained as duplicates following a whole genome duplication have only a 50% chance of being retained as duplicates in a second whole genome duplication. Genes which fractionated to a single copy following a second whole genome duplication tend to be the member of a gene pair with less complex promoters, lower levels of expression, and to be under lower levels of purifying selection. We suggest the copy with lower levels of expression and less purifying selection contributes less to effective gene-product dosage and therefore is under less dosage constraint in future whole genome duplications, providing an explanation for why flowering plant genomes are not overrun with subunits of large dose–sensitive protein complexes
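As a worked illustration of the 50% figure above: if some fraction of genes is retained as duplicates after a first whole genome duplication, and each of those has only a 50% chance of being retained again, the fraction retained through both rounds is halved instead of staying constant. The sketch below is simple arithmetic under that reading. The starting fraction of 0.2 and the function name are hypothetical and are not values from the study.

```python
def retained_through_both_rounds(first_round_fraction, reretention_probability=0.5):
    """Fraction of genes retained as duplicates in two successive duplications.

    first_round_fraction: fraction retained as duplicates after the first
    duplication (hypothetical input).
    reretention_probability: chance that such a gene is retained as a
    duplicate again (0.5 per the abstract above).
    """
    if not 0.0 <= first_round_fraction <= 1.0:
        raise ValueError("first_round_fraction must be between 0 and 1")
    if not 0.0 <= reretention_probability <= 1.0:
        raise ValueError("reretention_probability must be between 0 and 1")
    return first_round_fraction * reretention_probability


# Hypothetical starting point: 20% of genes retained after the first round.
strict_dosage_expectation = retained_through_both_rounds(0.2, reretention_probability=1.0)
observed_style_estimate = retained_through_both_rounds(0.2)
print(strict_dosage_expectation, observed_style_estimate)
```

Setting the probability to 1.0 corresponds to the strict prediction that dose-sensitive genes can never return to single copy. The default of 0.5 corresponds to the escape from preferential retention described in the abstract.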