
    Comparison of paralogous to nonparalogous genes.

    <p>Maize paralogous genes (identified by Schnable & Freeling <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004845#pgen.1004845-Schnable1" target="_blank">[52]</a>) were examined for any differences from nonparalogous genes that might spuriously contribute to their enrichment in GWAS analyses. There are no strong differences in either minor allele frequency distribution (A) or linkage disequilibrium decay (B), and the slightly lower SNP density (C) (median 32.8 SNPs/kb versus 33.4 SNPs/kb for nonparalogous genes) would be expected to actually decrease the probability of hitting paralogous genes, albeit by a very small amount.</p>

    Distribution of RNA expression.

    <p>Transcript-specific RNA expression values from the Maize Gene Atlas <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004845#pgen.1004845-Sekhon1" target="_blank">[30]</a> were summed to determine total expression for each gene. The log-transformed distributions of maximum expression values are shown for the entire filtered gene set (solid line) and for only those genes with GWAS hits within 5 kb of their primary transcripts (dashed line); vertical lines indicate the median of each distribution. The GWAS-hit genes show a slight depletion (∼20%) of low-expressed genes. For comparison, the median expression of maize transcription factors in this dataset (as annotated on Grassius, <a href="http://grassius.org/" target="_blank">http://grassius.org/</a>) is indicated by an arrowhead. FPKM, Fragments Per Kilobase of transcript per Million mapped reads.</p>
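The gene-level summary described in this caption can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the input layout (a dict keyed by gene and transcript, holding one FPKM value per tissue), the base-10 logarithm, and the pseudocount of 1 are all assumptions, and the gene and transcript names are hypothetical.

```python
import math
from statistics import median


def gene_max_log_expression(transcript_fpkm, pseudocount=1.0):
    """Sum transcript FPKM per gene, then log-transform each gene's maximum.

    transcript_fpkm maps (gene, transcript) -> list of FPKM values, one per
    tissue. The pseudocount (an assumption) keeps log10 defined at zero.
    """
    totals = {}
    for (gene, _transcript), values in transcript_fpkm.items():
        if gene not in totals:
            totals[gene] = [0.0] * len(values)
        # Add this transcript's expression to the gene total, tissue by tissue.
        totals[gene] = [a + b for a, b in zip(totals[gene], values)]
    # Take the maximum across tissues for each gene, then log-transform.
    return {g: math.log10(max(v) + pseudocount) for g, v in totals.items()}


# Hypothetical toy data: two genes, three tissues.
toy = {
    ("geneA", "T01"): [1.0, 2.0, 3.0],
    ("geneA", "T02"): [4.0, 5.0, 6.0],
    ("geneB", "T01"): [0.0, 99.0, 10.0],
}
log_expr = gene_max_log_expression(toy)
median_log_expr = median(log_expr.values())
```

Comparing `median(...)` over all genes with the median over the GWAS-hit subset corresponds to the two vertical lines in the plot.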

    Polymorphism effect size and allele frequencies.

    <p>(A) The standardized effect size of a polymorphism (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004845#s4" target="_blank">Methods</a>) is negatively correlated with minor allele frequency. This correlation is probably due to both biological factors (e.g., large effects are both more likely to be deleterious (Fisher 1930; Orr 1998) and more easily selected against than small ones, and thus are more likely to remain rare) and statistical ones (e.g., in order for a rare variant to explain enough variance to be detected in GWAS, it must have a large effect). Similar results were found in a previous analysis of maize inflorescence traits <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004845#pgen.1004845-Brown1" target="_blank">[12]</a>. (B) Minor allele frequency distributions for the different polymorphism classes of GWAS hits. Intergenic hits are strongly enriched for rare alleles. The bimodal distribution in both parts is due to the way NAM was constructed; specifically, since B73 is a parent in all 25 families, any polymorphisms with the rare allele in B73 have their frequency artificially boosted toward 0.5.</p>
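The frequency boost caused by the shared B73 parent can be illustrated with a small calculation. This is a sketch under simplifying assumptions that are not stated in the caption: all families are the same size, and each parent contributes half of the alleles in its family with no segregation distortion. The function names are hypothetical.

```python
def expected_nam_frequency(b73_carries, n_other_carriers, n_families=25):
    """Expected population-wide frequency of an allele in a NAM-style design.

    Each family is B73 crossed to one other parent. A family where both
    parents carry the allele is fixed for it (frequency 1.0); a family where
    exactly one parent carries it segregates at about 0.5 (assumption).
    """
    if b73_carries:
        fixed = n_other_carriers
        segregating = n_families - n_other_carriers
        return (fixed * 1.0 + segregating * 0.5) / n_families
    # B73 lacks the allele: only families whose other parent carries it segregate.
    return n_other_carriers * 0.5 / n_families


def minor_allele_frequency(freq):
    """Fold an allele frequency so that it refers to the rarer allele."""
    return min(freq, 1.0 - freq)


# An allele found only in B73 among the founders reaches about 0.5 overall...
b73_private = minor_allele_frequency(expected_nam_frequency(True, 0))
# ...while an allele found in a single non-B73 founder stays rare.
other_private = minor_allele_frequency(expected_nam_frequency(False, 1))
```

Under these assumptions an allele that is rare among the founders but present in B73 lands near 0.5, while one private to another founder stays near 0.02, which gives the two modes.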

    Distribution of non-genic GWAS hits as a function of gene distance.

    <p>The number of SNPs at increasing distances from the nearest gene is plotted; CNVs are excluded due to their large size and the difficulty of determining where many (especially insertions) actually occur. The input (whole genome) dataset shows a single peak at ∼25 kb away from a gene. The GWAS dataset, however, shows an additional peak at ∼1–5 kb (shaded), where one would expect to find promoters and short-range regulatory elements. Note that because of the log scale, each successive bin spans more nucleotides, which makes it appear that most SNPs are far from genes when the reverse is actually true.</p>
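The caveat about log-scale bins can be made concrete by comparing raw counts per bin with counts per nucleotide. The sketch below uses decade-wide bins and made-up distances purely for illustration; the bin edges and the data are assumptions, not values from the study.

```python
def decade_bins(distances, max_exp=5):
    """Count distances in bins [1, 10), [10, 100), ... up to 10**max_exp bp.

    Returns (low, high, count, count_per_bp) tuples. Dividing by the bin
    width shows the density that the raw counts on a log axis hide.
    """
    edges = [10 ** i for i in range(max_exp + 1)]
    rows = []
    for low, high in zip(edges, edges[1:]):
        count = sum(1 for d in distances if low <= d < high)
        rows.append((low, high, count, count / (high - low)))
    return rows


# Made-up distances (bp): raw counts rise with distance from the gene.
toy_distances = [5] * 10 + [50] * 20 + [5000] * 100
rows = decade_bins(toy_distances)
counts = [r[2] for r in rows]
densities = [r[3] for r in rows]
```

In this toy example the 1-10 kb bin holds the most SNPs, yet the SNP density per nucleotide is highest in the bin closest to the gene.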

    Number of polymorphisms found and variance explained for each trait.

    <p>(A) Polymorphisms found per trait. Bars show the mean and standard deviation of markers found per iteration before (light bars) and after (dark bars) filtering for RMIP≥0.05 (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004845#s4" target="_blank">Methods</a>). The number of markers found broadly mirrors the genetic complexity of each trait, with metabolic traits having fewer markers found than complex, polygenic traits like plant architecture. The relative complexity within each category is less certain, but the pattern probably still holds to a first approximation. (B) Variance explained per trait. For each trait, a general linear model incorporating a family term (for each of the 25 biparental families in NAM) and all SNPs that passed filtering (dark bars in (A)) was fit to the original Best Linear Unbiased Predictors (BLUPs) for each trait. Bars show the portion of total variance explained by the fitted SNPs as measured by adjusted R<sup>2</sup>.</p>
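For reference, adjusted R<sup>2</sup> penalizes ordinary R<sup>2</sup> for the number of fitted terms. The sketch below shows only the standard formula applied to observed and fitted values; it does not reproduce the model fitting itself, and how the family term was separated from the SNP terms is not specified in this caption, so that step is left out. The example numbers are invented.

```python
def adjusted_r2(observed, predicted, n_predictors):
    """Standard adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    observed and predicted are equal-length sequences; n_predictors (p) is
    the number of fitted terms excluding the intercept.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_resid = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    r2 = 1.0 - ss_resid / ss_total
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Invented example: five observations, one predictor.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
fit = [1.1, 1.9, 3.2, 3.8, 5.0]
adj = adjusted_r2(obs, fit, n_predictors=1)
```

With many SNPs in the model, the penalty term matters: adding a marker raises plain R<sup>2</sup> even when it explains nothing, whereas adjusted R<sup>2</sup> can fall.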

    Phenotypes used in this study.

    <p><a href="http://www.panzea.org/lit/data_sets.html#phenos" target="_blank">http://www.panzea.org/lit/data_sets.html#phenos</a>; the joint-linkage model used to create residuals for these data was provided courtesy of Sherry Flint-Garcia.</p>

    Different effects of the polymorphism classes.

    <p>(A) Variance explained by polymorphism class. Genic and gene-proximal polymorphisms explain the largest amount of unique variation in each trait. Breaking the data into the two components that most influence variance explained, allele frequency (B) and polymorphism effect size (C), reveals a negative correlation between them such that classes with larger effect sizes (e.g., intergenic) also tend to have rarer polymorphisms. (D) Pairwise p-values testing whether the distributions in (A–C) are significantly different from each other (two-sided Kolmogorov-Smirnov test); values <1×10<sup>−3</sup> are bolded.</p>
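A two-sample, two-sided Kolmogorov-Smirnov comparison of the kind described for panel (D) can be sketched in plain Python. The statistic is the largest gap between the two empirical distribution functions. The p-value here uses a common asymptotic approximation with a small-sample correction, which is an assumption: the caption does not say which implementation or p-value method the authors used, and the sample values are invented.

```python
import math


def ks_statistic(sample_a, sample_b):
    """Largest absolute difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step both CDFs past every value equal to x (handles ties).
        while i < n and a[i] == x:
            i += 1
        while j < m and b[j] == x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue_asymptotic(d, n, m, terms=100):
    """Approximate two-sided p-value from the Kolmogorov distribution."""
    if d <= 0.0:
        return 1.0
    root = math.sqrt(n * m / (n + m))
    lam = (root + 0.12 + 0.11 / root) * d  # small-sample correction (assumed)
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Invented samples: partially overlapping distributions.
d_overlap = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
p_overlap = ks_pvalue_asymptotic(d_overlap, 4, 4)
```

For real analyses an established implementation such as `scipy.stats.ks_2samp` is preferable, since it also offers exact p-values for small samples.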