82 research outputs found
Contribution of Transcription Factor Binding Site Motif Variants to Condition-Specific Gene Expression Patterns in Budding Yeast
It is now experimentally well known that variant sequences of a cis transcription factor binding site motif can contribute to differential regulation of genes. We characterize the relationship between motif variants and gene expression by analyzing expression microarray data and binding site predictions. To accomplish this, we statistically detect motif variants with effects that differ among environments. Such environmental specificity may be due to either affinity differences between variants or, more likely, differential interactions of TFs bound to these variants with cofactors, and with differential presence of cofactors across environments. We examine conservation of functional variants across four Saccharomyces species, and find that about a third of transcription factors have target genes that are differentially expressed in a condition-specific manner that is correlated with the nucleotide at variant motif positions. We find good correspondence between our results and some cases in the experimental literature (Reb1, Sum1, Mcm1, and Rap1). These results and growing consensus in the literature indicate that motif variants may often be functionally distinct, that this may be observed in genomic data, and that variants play an important role in condition-specific gene regulation.
Revisiting an Old Riddle: What Determines Genetic Diversity Levels within Species?
Understanding why some species have more genetic diversity than others is central to the study of ecology and evolution, and carries potentially important implications for conservation biology. Yet not only does this question remain unresolved, it has largely fallen into disregard. With the rapid decrease in sequencing costs, we argue that it is time to revive it.
The Timing of Selection at the Human FOXP2 Gene
Krause J, Lalueza-Fox C, Orlando L, et al. recently examined patterns of genetic variation at FOXP2 in 2 Neanderthals. This gene is of particular interest because it is involved in speech and language and was previously shown to harbor the signature of recent positive selection. The authors found the same 2 amino acid substitutions in Neanderthals as in modern humans. Assuming that these sites were the targets of selection and that there was no interbreeding between the 2 groups, they concluded that selection at FOXP2 occurred before the populations split, over 300 thousand years ago. Here, we show that the data are unlikely under this scenario but may instead be consistent with low rates of gene flow between modern humans and Neanderthals. We also collect additional data and introduce a modeling framework to estimate levels of modern human contamination of the Neanderthal samples. We find that, depending on the assumptions, additional control experiments may be needed to rule out contamination at FOXP2.
Protein Rates of Evolution Are Predicted by Double-Strand Break Events, Independent of Crossing-over Rates
Theory predicts that, owing to reduced Hill–Robertson interference, genomic regions with high crossing-over rates should experience more efficient selection. In Saccharomyces cerevisiae a negative correlation between the local recombination rate, assayed as meiotic double-strand breaks (DSBs), and the local rate of protein evolution has been considered consistent with such a model. Although DSBs are a prerequisite for crossing-over, they need not result in crossing-over. With recent high-resolution crossover data, we now return to this issue comparing two species of yeast. Strikingly, even allowing for crossover rates, both the rate of premeiotic DSBs and of noncrossover recombination events predict a gene's rate of evolution. This both questions the validity of prior analyses and strongly suggests that any correlation between crossover rates and rates of protein evolution could be owing to slow-evolving genes being prone to DSBs or a direct effect of DSBs on sequence evolution. To ask if classical theory of recombination has any relevance, we determine whether crossover rates predict rates of protein evolution, controlling for noncrossover DSB events, gene ontology (GO) class, gene expression, protein abundance, nucleotide content, and dispensability. We find that genes with high crossing-over rates have low rates of protein evolution after such control, although any correlation is weaker than that previously reported considering meiotic DSBs as a proxy. The data are consistent both with recombination enhancing the efficiency of purifying selection and, independently, with DSBs being associated with low rates of evolution.
Genomic Determinants of Protein Evolution and Polymorphism in Arabidopsis
Recent results from Drosophila suggest that positive selection has a substantial impact on genomic patterns of polymorphism and divergence. However, species with smaller population sizes and/or stronger population structure may not be expected to exhibit Drosophila-like patterns of sequence variation. We test this prediction and identify determinants of levels of polymorphism and rates of protein evolution using genomic data from Arabidopsis thaliana and the recently sequenced Arabidopsis lyrata genome. We find that, in contrast to Drosophila, there is no negative relationship between nonsynonymous divergence and silent polymorphism at any spatial scale examined. Instead, synonymous divergence is a major predictor of silent polymorphism, which suggests variation in mutation rate as the main determinant of silent variation. Variation in rates of protein divergence is mainly correlated with gene expression level and breadth, consistent with results for a broad range of taxa, and map-based estimates of recombination rate are only weakly correlated with nonsynonymous divergence. Variation in mutation rates and the strength of purifying selection seem to be major drivers of patterns of polymorphism and divergence in Arabidopsis. Nevertheless, a model allowing for varying negative and positive selection by functional gene category explains the data better than a homogeneous model, implying the action of positive selection on a subset of genes. Genes involved in disease resistance and abiotic stress display high proportions of adaptive substitution. Our results are important for a general understanding of the determinants of rates of protein evolution and the impact of selection on patterns of polymorphism and divergence.
Trait-Associated SNPs Are More Likely to Be eQTLs: Annotation to Enhance Discovery from GWAS
Although genome-wide association studies (GWAS) of complex traits have yielded more reproducible associations than had been discovered using any other approach, the loci characterized to date do not account for much of the heritability of such traits and, in general, have not led to improved understanding of the biology underlying complex phenotypes. Using a web site we developed to serve results of expression quantitative trait locus (eQTL) studies in lymphoblastoid cell lines from HapMap samples (http://www.scandb.org), we show that single nucleotide polymorphisms (SNPs) associated with complex traits (from http://www.genome.gov/gwastudies/) are significantly more likely to be eQTLs than minor-allele-frequency–matched SNPs chosen from high-throughput GWAS platforms. These findings are robust across a range of thresholds for establishing eQTLs (p-values from 10^-4 to 10^-8), and a broad spectrum of human complex traits. Analyses of GWAS data from the Wellcome Trust studies confirm that annotating SNPs with a score reflecting the strength of the evidence that the SNP is an eQTL can improve the ability to discover true associations and clarify the nature of the mechanism driving the associations. Our results, showing that trait-associated SNPs are more likely to be eQTLs and that application of this information can enhance discovery of trait-associated SNPs for complex phenotypes, raise the possibility that we can utilize this information both to increase the heritability explained by identifiable genetic factors and to gain a better understanding of the biology underlying complex traits.
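The comparison against minor-allele-frequency-matched SNPs described in this abstract can be illustrated with a small resampling sketch. This is not the authors' pipeline: the function name `enrichment_p_value`, the MAF-bin labels, and the toy SNP identifiers are all hypothetical, and the inputs are made-up illustrative values rather than data from the study.

```python
import random


def enrichment_p_value(trait_snps, is_eqtl, maf_bin, pool, n_draws=1000, seed=0):
    """Empirical test of whether trait SNPs are eQTLs more often than
    MAF-matched SNPs drawn from a background pool (hypothetical helper).

    trait_snps: list of SNP ids associated with traits
    is_eqtl:    dict SNP id -> bool (is the SNP called an eQTL?)
    maf_bin:    dict SNP id -> label of its minor-allele-frequency bin
    pool:       list of background SNP ids to sample matched SNPs from
    """
    rng = random.Random(seed)
    observed = sum(is_eqtl[s] for s in trait_snps)

    # Group background SNPs by MAF bin so each draw is frequency-matched.
    by_bin = {}
    for s in pool:
        by_bin.setdefault(maf_bin[s], []).append(s)

    hits = 0
    for _ in range(n_draws):
        # For every trait SNP, pick one background SNP from the same MAF bin.
        matched = [rng.choice(by_bin[maf_bin[s]]) for s in trait_snps]
        if sum(is_eqtl[s] for s in matched) >= observed:
            hits += 1

    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (hits + 1) / (n_draws + 1)


if __name__ == "__main__":
    # Toy, made-up inputs: three trait SNPs, four background SNPs.
    trait = ["t1", "t2", "t3"]
    pool = ["b1", "b2", "b3", "b4"]
    is_eqtl = {"t1": True, "t2": True, "t3": True,
               "b1": False, "b2": False, "b3": False, "b4": False}
    maf_bin = {"t1": "low", "t2": "low", "t3": "high",
               "b1": "low", "b2": "low", "b3": "high", "b4": "high"}
    print(enrichment_p_value(trait, is_eqtl, maf_bin, pool))
```

Matching on frequency bin is the point of the design: eQTL detection power depends on allele frequency, so an unmatched background could produce a spurious enrichment.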
Integration of GWAS SNPs and tissue specific expression profiling reveal discrete eQTLs for human traits in blood and brain
Our knowledge of the transcriptome has become much more complex since the days of
the central dogma of molecular biology. We now know that splicing takes place to
create potentially thousands of isoforms from a single gene, and we know that RNA
does not always faithfully recapitulate DNA if RNA editing occurs. Collectively, these
observations show that the transcriptome is amazingly rich with intricate regulatory
mechanisms for overall gene expression, splicing, and RNA editing.
Genetic variability can play a role in controlling gene expression, which can be
identified by examining expression quantitative trait loci (eQTLs). eQTLs are genomic
regions where genetic variants, including single nucleotide polymorphisms (SNPs),
show a statistical association with expression of mRNA transcripts. In humans, many
SNPs are also associated with disease, and have been identified using genome-wide
association studies (GWAS), but the biological effects of those SNPs are usually not
known. If SNPs found in GWAS are also found in eQTLs, then one could hypothesize
that expression levels may contribute to disease risk. Performing eQTL analysis with
GWAS SNPs in both blood and brain, specifically the frontal cortex and the
cerebellum, we found both shared and tissue-unique eQTLs. The identification of
tissue-unique eQTLs supports the argument that choice of tissue type is important in
eQTL studies (Paper I).
Aging is a complex process with the mechanisms underlying aging still being poorly
defined. There is evidence that the transcriptome changes with age, and hence we used
the brain dataset from our first paper as a discovery set, with an additional replication
dataset, to investigate any aging-gene expression associations. We found evidence that
many genes were associated with aging. We further found that there were more
statistically significant expression changes in the frontal cortex versus the cerebellum,
indicating that brain regions may age at different rates. As the brain is a heterogeneous
tissue including both neurons and non-neuronal cells, we used laser capture
microdissection (LCM) to capture Purkinje
cells as a representative neuronal type and repeated the age analysis. Looking at the
discovery, replication and Purkinje cell datasets we found five genes with strong,
replicated evidence of age-expression associations (Paper II).
Being able to capture and quantify the depth of the transcriptome has been a lengthy
process starting with methods that could only measure a single gene to genome-wide
techniques such as microarray. A recently developed technology, RNA-Seq, shows
promise in its ability to capture expression, splicing, and editing and, with its broad
dynamic range, quantification is accurate and reliable. RNA-Seq is, however, data
intensive and a great deal of computational expertise is required to fully utilize the
strengths of this method. We aimed to create a small, well-controlled, experiment in
order to test the performance of this relatively new technology in the brain. We chose
embryonic versus adult mouse cerebral cortex, as mice are genetically homogeneous and there
are many known differences in gene expression related to brain development that we
could use as benchmarks for analysis testing. We found a large number of differences
in total gene expression between embryonic and adult brain. Rigorous technical and
biological validation illustrated the accuracy and dynamic range of RNA-Seq. We were
also able to interrogate differences in exon usage in the same dataset. Finally, we
were able to identify and quantify both well-known and novel A-to-I edit sites. Overall,
this project helped us develop the tools needed to build usable pipelines for RNA-Seq
data processing (Paper III).
Our studies in the developing brain (Paper III) illustrated that RNA-Seq was a useful
unbiased method for investigating RNA editing. To extend this further, we utilized a
genetically modified mouse model to study the transcriptomic role of the RNA editing
enzyme ADAR2. We found that ADAR2 was important for editing of the coding
region of mRNA as a large proportion of RNA editing sites in coding regions had a
statistically significant decrease in editing percentages in Adar2-/- Gria2R/R mice versus
controls. However, despite indications in the literature that ADAR2 may also be
involved in splicing and expression regulatory machinery we found no changes in gene
expression or exon utilization in Adar2-/- Gria2R/R mice as compared to their littermate
controls (Paper IV).
In our final study, based on the methods developed in Papers III and IV, we revisited
the idea of age related gene expression associations from Paper II. We used a subset of
human frontal cortices for RNA sequencing. Interestingly, we found more gene
expression changes with aging compared to the previous data using microarrays in
Paper II. When the significant gene lists were analysed for gene ontology enrichment,
we found that there was a large number of downregulated genes involved in synaptic
function while those that were upregulated had enrichment in immune function. This
dataset illustrates that the aging brain may be predisposed to the processes found in
neurodegenerative diseases (Paper V).
Integrative genomic analysis of the human immune response to influenza vaccination
Identification of the host genetic factors that contribute to variation in vaccine responsiveness may uncover important mechanisms affecting vaccine efficacy. We carried out an integrative, longitudinal study combining genetic, transcriptional, and immunologic data in humans given seasonal influenza vaccine. We identified 20 genes exhibiting a transcriptional response to vaccination, significant genotype effects on gene expression, and correlation between the transcriptional and antibody responses. The results show that variation at the level of genes involved in membrane trafficking and antigen processing significantly influences the human response to influenza vaccination. More broadly, we demonstrate that an integrative study design is an efficient alternative to existing methods for the identification of genes involved in complex traits. DOI: http://dx.doi.org/10.7554/eLife.00299.00
Determinants of the efficacy of natural selection on coding and noncoding variability in two passerine species
Population genetic theory predicts that selection should be more effective when the effective population size (Ne) is larger, and that the efficacy of selection should correlate positively with recombination rate. Here, we analyzed the genomes of ten great tits and ten zebra finches. Nucleotide diversity at 4-fold degenerate sites indicates that zebra finches have a 2.83-fold larger Ne. We obtained clear evidence that purifying selection is more effective in zebra finches. The proportion of substitutions at 0-fold degenerate sites fixed by positive selection (α) is high in both species (great tit 48%; zebra finch 64%) and is significantly higher in zebra finches. When α was estimated on GC-conservative changes (i.e., between A and T and between G and C), the estimates were lower in both species (great tit 22%; zebra finch 53%). A theoretical model presented herein suggests that failing to control for the effects of GC-biased gene conversion (gBGC) is potentially a contributor to the overestimation of α, and that this effect cannot be alleviated by first fitting a demographic model to neutral variants. We present the first estimates in birds for α in the untranslated regions, and find evidence for substantial adaptive changes. Finally, although purifying selection is stronger in high-recombination regions, we obtained mixed evidence for α increasing with recombination rate, especially after accounting for gBGC. These results highlight that it is important to consider the potential confounding effects of gBGC when quantifying selection and that our understanding of what determines the efficacy of selection is incomplete.
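The step from nucleotide diversity to a ratio of effective population sizes can be sketched with the standard neutral expectation. This is a textbook relation, not necessarily the authors' exact procedure; it assumes that 4-fold degenerate sites evolve neutrally and that the per-generation mutation rate μ is the same in both species. The subscripts ZF (zebra finch) and GT (great tit) are labels introduced here for illustration.

```latex
% Expected neutral diversity in a diploid population at equilibrium
\pi \approx 4 N_e \mu

% Ratio of effective population sizes from the two diversities
\frac{N_e^{\mathrm{ZF}}}{N_e^{\mathrm{GT}}}
  \approx \frac{\pi_{\mathrm{ZF}}}{\pi_{\mathrm{GT}}}
          \cdot \frac{\mu_{\mathrm{GT}}}{\mu_{\mathrm{ZF}}}
  = \frac{\pi_{\mathrm{ZF}}}{\pi_{\mathrm{GT}}}
  \quad \text{if } \mu_{\mathrm{ZF}} = \mu_{\mathrm{GT}}
```

Under these assumptions, the reported 2.83-fold difference in Ne corresponds to a 2.83-fold difference in diversity at 4-fold degenerate sites.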