110 research outputs found
Bayesian Sparse Factor Analysis of Genetic Covariance Matrices
Quantitative genetic studies that model complex, multivariate phenotypes are
important for both evolutionary prediction and artificial selection. For
example, changes in gene expression can provide insight into developmental and
physiological mechanisms that link genotype and phenotype. However, classical
analytical techniques are poorly suited to quantitative genetic studies of gene
expression where the number of traits assayed per individual can reach many
thousand. Here, we derive a Bayesian genetic sparse factor model for estimating
the genetic covariance matrix (G-matrix) of high-dimensional traits, such as
gene expression, in a mixed effects model. The key idea of our model is that we
need only consider G-matrices that are biologically plausible. An organism's
entire phenotype is the result of processes that are modular and have limited
complexity. This implies that the G-matrix will be highly structured. In
particular, we assume that a limited number of intermediate traits (or factors,
e.g., variations in development or physiology) control the variation in the
high-dimensional phenotype, and that each of these intermediate traits is
sparse -- affecting only a few observed traits. The advantages of this approach
are two-fold. First, sparse factors are interpretable and provide biological
insight into mechanisms underlying the genetic architecture. Second, enforcing
sparsity helps prevent sampling errors from swamping out the true signal in
high-dimensional data. We demonstrate the advantages of our model on simulated
data and in an analysis of a published Drosophila melanogaster gene expression
data set. Comment: 35 pages, 7 figures
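The structural assumption described in this abstract (a few factors, each loading on only a few traits) can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not the authors' implementation: it uses the standard factor-analytic form G = ΛΛᵀ + Ψ, and the function name, dimensions, and variance ranges are hypothetical choices made for illustration. The model in the paper is fitted with Bayesian priors inside a mixed effects model, which this sketch does not attempt.

```python
import numpy as np

def sparse_factor_g_matrix(n_traits, n_factors, traits_per_factor, seed=0):
    """Simulate a structured G-matrix from sparse factor loadings.

    Hypothetical helper: assumes G = L @ L.T + diag(psi), where each
    column of L (one factor) is nonzero for only a few traits.
    """
    rng = np.random.default_rng(seed)
    loadings = np.zeros((n_traits, n_factors))
    for k in range(n_factors):
        # Each factor affects a small random subset of the observed traits.
        idx = rng.choice(n_traits, size=traits_per_factor, replace=False)
        loadings[idx, k] = rng.normal(size=traits_per_factor)
    # Trait-specific genetic variance not explained by the factors.
    residual_var = rng.uniform(0.1, 0.5, size=n_traits)
    g = loadings @ loadings.T + np.diag(residual_var)
    return loadings, g

loadings, G = sparse_factor_g_matrix(n_traits=200, n_factors=5, traits_per_factor=10)
print(G.shape, int((loadings != 0).sum()))
```

With 200 traits, an unstructured G-matrix has 20,100 free parameters, whereas this structured version is determined by 50 nonzero loadings and 200 trait-specific variances.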
Dissecting high-dimensional phenotypes with Bayesian sparse factor analysis of genetic covariance matrices.
Quantitative genetic studies that model complex, multivariate phenotypes are important for both evolutionary prediction and artificial selection. For example, changes in gene expression can provide insight into developmental and physiological mechanisms that link genotype and phenotype. However, classical analytical techniques are poorly suited to quantitative genetic studies of gene expression where the number of traits assayed per individual can reach many thousand. Here, we derive a Bayesian genetic sparse factor model for estimating the genetic covariance matrix (G-matrix) of high-dimensional traits, such as gene expression, in a mixed-effects model. The key idea of our model is that we need consider only G-matrices that are biologically plausible. An organism's entire phenotype is the result of processes that are modular and have limited complexity. This implies that the G-matrix will be highly structured. In particular, we assume that a limited number of intermediate traits (or factors, e.g., variations in development or physiology) control the variation in the high-dimensional phenotype, and that each of these intermediate traits is sparse, affecting only a few observed traits. The advantages of this approach are twofold. First, sparse factors are interpretable and provide biological insight into mechanisms underlying the genetic architecture. Second, enforcing sparsity helps prevent sampling errors from swamping out the true signal in high-dimensional data. We demonstrate the advantages of our model on simulated data and in an analysis of a published Drosophila melanogaster gene expression data set.
Translational regulation contributes to the elevated CO2 response in two Solanum species.
Understanding the impact of elevated CO2 (eCO2) in global agriculture is important given climate change projections. Breeding climate-resilient crops depends on genetic variation within naturally varying populations. The effect of genetic variation in response to eCO2 is poorly understood, especially in crop species. We describe the different ways in which Solanum lycopersicum and its wild relative S. pennellii respond to eCO2, from cell anatomy, to the transcriptome, and metabolome. We further validate the importance of translational regulation as a potential mechanism for plants to adaptively respond to rising levels of atmospheric CO2.
Large-effect flowering time mutations reveal conditionally adaptive paths through fitness landscapes in Arabidopsis thaliana.
Contrary to previous assumptions that most mutations are deleterious, there is increasing evidence for persistence of large-effect mutations in natural populations. A possible explanation for these observations is that mutant phenotypes and fitness may depend upon the specific environmental conditions to which a mutant is exposed. Here, we tested this hypothesis by growing large-effect flowering time mutants of Arabidopsis thaliana in multiple field sites and seasons to quantify their fitness effects in realistic natural conditions. By constructing environment-specific fitness landscapes based on flowering time and branching architecture, we observed that a subset of mutations increased fitness, but only in specific environments. These mutations increased fitness via different paths: through shifting flowering time, branching, or both. Branching was under stronger selection, but flowering time was more genetically variable, pointing to the importance of indirect selection on mutations through their pleiotropic effects on multiple phenotypes. Finally, mutations in hub genes with greater connectedness in their regulatory networks had greater effects on both phenotypes and fitness. Together, these findings indicate that large-effect mutations may persist in populations because they influence traits that are adaptive only under specific environmental conditions. Understanding their evolutionary dynamics therefore requires measuring their effects in multiple natural environments
Functional variants of DOG1 control seed chilling responses and variation in seasonal life-history strategies in Arabidopsis thaliana.
The seasonal timing of seed germination determines a plant's realized environmental niche, and is important for adaptation to climate. The timing of seasonal germination depends on patterns of seed dormancy release or induction by cold and interacts with flowering-time variation to construct different seasonal life histories. To characterize the genetic basis and climatic associations of natural variation in seed chilling responses and associated life-history syndromes, we selected 559 fully sequenced accessions of the model annual species Arabidopsis thaliana from across a wide climate range and scored each for seed germination across a range of 13 cold stratification treatments, as well as the timing of flowering and senescence. Germination strategies varied continuously along 2 major axes: 1) Overall germination fraction and 2) induction vs. release of dormancy by cold. Natural variation in seed responses to chilling was correlated with flowering time and senescence to create a range of seasonal life-history syndromes. Genome-wide association identified several loci associated with natural variation in seed chilling responses, including a known functional polymorphism in the self-binding domain of the candidate gene DOG1. A phylogeny of DOG1 haplotypes revealed ancient divergence of these functional variants associated with periods of Pleistocene climate change, and Gradient Forest analysis showed that allele turnover of candidate SNPs was significantly associated with climate gradients. These results provide evidence that A. thaliana's germination niche and correlated life-history syndromes are shaped by past climate cycles, as well as local adaptation to contemporary climate
Genomic characterization of the evolutionary potential of the sea urchin Strongylocentrotus droebachiensis facing ocean acidification
Ocean acidification (OA) is increasing due to anthropogenic CO2 emissions and poses a threat to marine species and communities worldwide. To better project the effects of acidification on organisms’ health and persistence, an understanding is needed of the 1) mechanisms underlying developmental and physiological tolerance and 2) potential populations have for rapid evolutionary adaptation. This is especially challenging in nonmodel species where targeted assays of metabolism and stress physiology may not be available or economical for large-scale assessments of genetic constraints. We used mRNA sequencing and a quantitative genetics breeding design to study mechanisms underlying genetic variability and tolerance to decreased seawater pH (-0.4 pH units) in larvae of the sea urchin Strongylocentrotus droebachiensis. We used a gene ontology-based approach to integrate expression profiles into indirect measures of cellular and biochemical traits underlying variation in larval performance (i.e., growth rates). Molecular responses to OA were complex, involving changes to several functions such as growth rates, cell division, metabolism, and immune activities. Surprisingly, the magnitude of pH effects on molecular traits tended to be small relative to variation attributable to segregating functional genetic variation in this species. We discuss how the application of transcriptomics and quantitative genetics approaches across diverse species can enrich our understanding of the biological impacts of climate change
Uneven distribution of mutational variance across the transcriptome of Drosophila serrata revealed by high-dimensional analysis of gene expression
There are essentially an infinite number of traits that could be measured on any organism, and almost all individual traits display genetic variation, yet substantial genetic variance in a large number of independent traits is not plausible under basic models of selection and mutation. One mechanism that may be invoked to explain the observed levels of genetic variance in individual traits is that pleiotropy results in fewer dimensions of phenotypic space with substantial genetic variance. Multivariate genetic analyses of small sets of functionally related traits have shown that standing genetic variance is often concentrated in relatively few dimensions. It is unknown if a similar concentration of genetic variance occurs at a phenome-wide scale when many traits of disparate function are considered, or if the genetic variance generated by new mutations is also unevenly distributed across phenotypic space. Here, we used a Bayesian sparse factor model to characterize the distribution of mutational variance of 3385 gene expression traits of Drosophila serrata after 27 generations of mutation accumulation, and found that 46% of the estimated mutational variance was concentrated in just 21 dimensions with significant mutational heritability. We show that the extent of concentration of mutational variance into such a small subspace has the potential to substantially bias the response to selection of these traits.
Maintenance of quantitative genetic variance in complex, multitrait phenotypes: the contribution of rare, large effect variants in 2 Drosophila species
The interaction of evolutionary processes to determine quantitative genetic variation has implications for contemporary and future phenotypic evolution, as well as for our ability to detect causal genetic variants. While theoretical studies have provided robust predictions to discriminate among competing models, empirical assessment of these has been limited. In particular, theory highlights the importance of pleiotropy in resolving observations of selection and mutation, but empirical investigations have typically been limited to few traits. Here, we applied high-dimensional Bayesian Sparse Factor Genetic modeling to gene expression datasets in 2 species, Drosophila melanogaster and Drosophila serrata, to explore the distributions of genetic variance across high-dimensional phenotypic space. Surprisingly, most of the heritable trait covariation was due to few lines (genotypes) with extreme [>3 interquartile ranges (IQR) from the median] values. Intriguingly, while genotypes extreme for a multivariate factor also tended to have a higher proportion of individual traits that were extreme, we also observed genotypes that were extreme for multivariate factors but not for any individual trait. We observed other consistent differences between heritable multivariate factors with outlier lines vs those factors without extreme values, including differences in gene functions. We use these observations to identify further data required to advance our understanding of the evolutionary dynamics and nature of standing genetic variation for quantitative traits
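The outlier criterion quoted in this abstract (values more than 3 interquartile ranges from the median) can be sketched as a simple check. This is an illustrative assumption about how such a rule might be coded, not the authors' pipeline: the function name and the example scores are hypothetical, and the quartiles use NumPy's default linear interpolation.

```python
import numpy as np

def extreme_lines(scores, n_iqr=3.0):
    """Flag values lying more than n_iqr interquartile ranges from the median.

    Hypothetical helper: `scores` stands for one factor's (or one trait's)
    values across lines (genotypes).
    """
    scores = np.asarray(scores, dtype=float)
    q1, median, q3 = np.percentile(scores, [25, 50, 75])
    iqr = q3 - q1
    return np.abs(scores - median) > n_iqr * iqr

# Made-up scores for seven lines; only the last one is far from the rest.
scores = np.array([-0.4, -0.1, 0.0, 0.2, 0.3, 0.5, 9.0])
flags = extreme_lines(scores)
print(flags)
```

Applying the same check to each individual trait and to each multivariate factor would allow the comparison described above, where a line can be extreme for a factor without being extreme for any single trait.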
- …
