Genetic Analysis of Central Carbon Metabolism Unveils an Amino Acid Substitution That Alters Maize NAD-Dependent Isocitrate Dehydrogenase Activity
Background: Central carbon metabolism (CCM) is a fundamental component of life. The participating genes and enzymes are thought to be structurally and functionally conserved across and within species. Association mapping utilizes a rich history of mutation and recombination to achieve high-resolution mapping. Maize (Zea mays ssp. mays), the most diverse model crop species, is therefore a particularly attractive system for applying association mapping to the genetics of CCM. Methodology/Principal Findings: We used a maize diversity panel to test the functional conservation of CCM. We found heritable variation in enzyme activity for every enzyme tested. One of these enzymes was the NAD-dependent isocitrate dehydrogenase (IDH, E.C. 1.1.1.41), in which we identified a novel amino-acid substitution at a phylogenetically conserved site. Using candidate gene association mapping, we found that this non-synonymous polymorphism was associated with IDH activity variation. The proposed mechanism for the IDH activity variation includes additional components regulating protein level. By comparing sequences from maize and teosinte (Zea mays ssp. parviglumis), the wild ancestor of maize, we found that some CCM genes had also been targeted for selection during maize domestication. Conclusions/Significance: Our results demonstrate the efficacy of association mapping for dissecting natural variation in primary metabolic pathways. The considerable genetic diversity observed in maize CCM genes underlies heritable phenotypic variation in enzyme activities and can be useful for identifying putative functional sites.
Polymorphism effect size and allele frequencies.
<p>(A) The standardized effect size of a polymorphism (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004845#s4" target="_blank">Methods</a>) is negatively correlated with minor allele frequency. This correlation is probably due to both biological factors (e.g., large effects are both more likely to be deleterious (Fisher 1930; Orr 1998) and more easily selected against than small ones, and thus are more likely to remain rare) and statistical ones (e.g., in order for a rare variant to explain enough variance to be detected in GWAS, it must have a large effect). Similar results were found in a previous analysis of maize inflorescence traits <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004845#pgen.1004845-Brown1" target="_blank">[12]</a>. (B) Minor allele frequency distributions for the different polymorphism classes of GWAS hits. Intergenic hits are strongly enriched for rare alleles. The bimodal distribution in both parts is due to the way NAM was constructed; specifically, since B73 is a parent in all 25 families, any polymorphisms with the rare allele in B73 have their frequency artificially boosted toward 0.5.</p>
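The statistical half of this argument can be illustrated with a small sketch. It assumes the textbook additive model for a single biallelic locus under Hardy-Weinberg proportions, where the variance contributed is 2p(1-p)a² for allele frequency p and per-allele effect a. That formula and the function names below are illustrative assumptions, not the method used in the study (NAM lines are inbred, so the constant would differ, but the dependence on p is the same).

```python
import math


def variance_explained(maf: float, effect: float) -> float:
    """Additive variance from one biallelic locus: 2p(1-p)a^2.

    Assumes Hardy-Weinberg proportions (an illustrative simplification).
    """
    return 2.0 * maf * (1.0 - maf) * effect ** 2


def effect_needed(maf: float, target_variance: float) -> float:
    """Per-allele effect required to reach a given variance explained."""
    return math.sqrt(target_variance / (2.0 * maf * (1.0 - maf)))


# The rarer the allele, the larger the effect needed to explain the
# same share of variance (here an arbitrary target of 0.01).
for maf in (0.02, 0.10, 0.50):
    print(f"MAF={maf:.2f}  effect needed={effect_needed(maf, 0.01):.3f}")
```

Under this sketch, a detection threshold expressed as variance explained translates into a minimum effect size that grows as the minor allele becomes rarer, which is one way a negative correlation can arise from ascertainment alone.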
Distribution of non-genic GWAS hits as a function of gene distance.
<p>The number of SNPs at increasing distances from the nearest gene is plotted; CNVs are excluded due to their large size and the difficulty of determining where many (especially insertions) actually occur. The input (whole genome) dataset shows a single peak at ∼25 kb away from a gene. The GWAS dataset, however, shows an additional peak at ∼1–5 kb (shaded), where one would expect to find promoters and short-range regulatory elements. Note that due to the log scale, each bin contains successively more nucleotides, which makes it appear that most SNPs are far from genes when the reverse is actually true.</p>
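The caveat about the log scale can be made concrete with a short sketch. The bin range and the number of bins per decade below are arbitrary assumptions chosen for illustration, not the binning used in the figure.

```python
import math


def log_bin_edges(start_bp: float, stop_bp: float, bins_per_decade: int) -> list[float]:
    """Bin edges equally spaced on a log10 axis between start_bp and stop_bp."""
    n_bins = round(math.log10(stop_bp / start_bp) * bins_per_decade)
    return [start_bp * 10 ** (i / bins_per_decade) for i in range(n_bins + 1)]


def bin_widths(edges: list[float]) -> list[float]:
    """Width of each bin in base pairs."""
    return [hi - lo for lo, hi in zip(edges, edges[1:])]


# Hypothetical range: 100 bp to 1 Mb, two bins per decade.
edges = log_bin_edges(100, 1_000_000, 2)
widths = bin_widths(edges)

# With a uniform SNP density, the expected count in each bin is
# proportional to its width, so later bins look fuller even though
# the density per base pair is unchanged.
for lo, width in zip(edges, widths):
    print(f"bin starting at {lo:>10.0f} bp spans {width:>10.0f} bp")
```

Dividing each bin's count by its width in base pairs would give a density that is comparable across bins.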
Number of polymorphisms found and variance explained for each trait.
<p>(A) Polymorphisms found per trait. Bars show the mean and standard deviation of markers found per iteration before (light bars) and after (dark bars) filtering for RMIP≥0.05 (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004845#s4" target="_blank">Methods</a>). The number of markers found tends to broadly mirror the genetic complexity of each trait, with metabolic traits having fewer markers found than complex, polygenic traits like plant architecture. The relative complexity within each category is less certain, but the pattern still probably holds to a first approximation. (B) Variance explained per trait. For each trait, a general linear model incorporating a family term (for each of the 25 biparental families in NAM) and all SNPs that passed filtering (dark bars in (A)) was fit to the original Best Linear Unbiased Predictors (BLUPs) for each trait. Bars show the portion of total variance explained by the fitted SNPs as measured by adjusted R<sup>2</sup>.</p>
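For readers unfamiliar with the statistic in panel (B), the standard definition of adjusted R² is sketched below. This is the generic textbook formula; how the study separated the SNP contribution from the family term is not stated in this caption, so the function and its parameter names are illustrative assumptions only.

```python
def adjusted_r2(r2: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    Penalizes R^2 for the number of fitted predictors p, so that adding
    uninformative SNPs to the model does not inflate variance explained.
    """
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


# Hypothetical numbers: the same raw R^2 is worth less when it took
# more predictors to achieve it.
print(adjusted_r2(0.50, n_obs=1000, n_predictors=10))
print(adjusted_r2(0.50, n_obs=1000, n_predictors=200))
```

The penalty matters here because traits with many passing SNPs would otherwise appear to have more variance explained simply because more terms were fitted.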
Distribution of RNA expression.
<p>Transcript-specific RNA expression values from the Maize Gene Atlas <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004845#pgen.1004845-Sekhon1" target="_blank">[30]</a> were summed to determine total expression for each gene. The log-transformed distribution of maximum expression values is shown for the entire filtered gene set (solid line) or just genes with GWAS hits within 5 kb of their primary transcripts (dashed line); vertical lines indicate the median of each distribution. The GWAS-hit genes show a slight depletion (∼20%) of low-expressed genes. For comparison, the median expression of maize transcription factors in this dataset (as annotated on Grassius, <a href="http://grassius.org/" target="_blank">http://grassius.org/</a>) is indicated by an arrowhead. FPKM, Fragments Per Kilobase of transcript per Million mapped reads.</p>
Phenotypes used in this study.
a<p><a href="http://www.panzea.org/lit/data_sets.html#phenos" target="_blank">http://www.panzea.org/lit/data_sets.html#phenos</a>; the joint-linkage model to create residuals for this data was provided courtesy of Sherry Flint-Garcia.</p>
Different effects of the polymorphism classes.
<p>(A) Variance explained by polymorphism class. Genic and gene-proximal polymorphisms explain the largest amount of unique variation in each trait. Breaking the data into the two components that most influence variance explained, allele frequency (B) and polymorphism effect size (C), reveals a negative correlation between them such that classes with larger effect sizes (e.g., intergenic) also tend to have rarer polymorphisms. (D) Pairwise p-values testing whether the distributions in (A-C) are significantly different from each other (two-sided Kolmogorov-Smirnov test); values <1×10<sup>−3</sup> are bolded.</p>
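The test in panel (D) compares two empirical distributions through the largest gap between their cumulative distribution functions. A minimal sketch of that statistic is given below; it computes only the distance D, not a p-value, and the sample values are made up for illustration. In practice a library routine such as `scipy.stats.ks_2samp` would normally be used to obtain the p-value as well.

```python
from bisect import bisect_right


def ks_statistic(sample_a: list[float], sample_b: list[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic D.

    D is the maximum absolute difference between the two empirical
    CDFs, evaluated at every observed value.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical minor allele frequencies for two polymorphism classes.
genic = [0.08, 0.15, 0.22, 0.31, 0.44]
intergenic = [0.02, 0.03, 0.05, 0.09, 0.12]
print(ks_statistic(genic, intergenic))
```

Because the statistic depends only on the ordering of values, it is sensitive to any difference in shape or location between two classes, not just a shift in the mean.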
Comparison of paralogous to nonparalogous genes.
<p>Maize paralogous genes (identified by Schnable & Freeling <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004845#pgen.1004845-Schnable1" target="_blank">[52]</a>) were examined for any differences from nonparalogous genes that might spuriously contribute to their enrichment in GWAS analyses. There are no strong differences in either minor allele frequency distribution (A) or linkage disequilibrium decay (B), and the slightly lower SNP density (C) (median 32.8 SNPs/kb versus 33.4 SNPs/kb for nonparalogous genes) would be expected to actually decrease the probability of hitting paralogous genes, albeit by a very small amount.</p>
Genome-wide association of carbon and nitrogen metabolism in the maize nested association mapping population.
Carbon (C) and nitrogen (N) metabolism are critical to plant growth and development and underlie crop yield and adaptation. We performed high-throughput metabolite analyses on over 12,000 samples from the nested association mapping population to identify genetic variation in C and N metabolism in maize (Zea mays ssp. mays). All samples were grown in the same field and used to identify natural variation controlling the levels of 12 key C and N metabolites, namely chlorophyll a, chlorophyll b, fructose, fumarate, glucose, glutamate, malate, nitrate, starch, sucrose, total amino acids, and total protein, along with the first two principal components derived from them. Our genome-wide association results frequently identified hits with single-gene resolution. In addition to expected genes such as invertases, natural variation was identified in key C4 metabolism genes, including carbonic anhydrases and a malate transporter. Unlike several prior maize studies, extensive pleiotropy was found for C and N metabolites. This integration of field-derived metabolite data with powerful mapping and genomics resources allows for the dissection of key metabolic pathways, providing avenues for future genetic improvement.