Integration of GWAS SNPs and tissue specific expression profiling reveal discrete eQTLs for human traits in blood and brain
Our knowledge of the transcriptome has become much more complex since the days of
the central dogma of molecular biology. We now know that splicing takes place to
create potentially thousands of isoforms from a single gene, and we know that RNA
does not always faithfully recapitulate DNA if RNA editing occurs. Collectively, these
observations show that the transcriptome is amazingly rich with intricate regulatory
mechanisms for overall gene expression, splicing, and RNA editing.
Genetic variability can play a role in controlling gene expression, which can be
identified by examining expression quantitative trait loci (eQTLs). eQTLs are genomic
regions where genetic variants, including single nucleotide polymorphisms (SNPs)
show a statistical association with expression of mRNA transcripts. In humans, many
SNPs are also associated with disease and have been identified using genome-wide
association studies (GWAS), but the biological effects of those SNPs are usually not
known. If SNPs found in GWAS are also found in eQTLs, then one could hypothesize
that expression levels may contribute to disease risk. Performing eQTL analysis with
GWAS SNPs in both blood and brain, specifically the frontal cortex and the
cerebellum, we found both shared and tissue-unique eQTLs. The identification of
tissue-unique eQTLs supports the argument that choice of tissue type is important in
eQTL studies (Paper I).
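
As an illustration of the statistical association that defines an eQTL, the following minimal sketch regresses expression on allele dosage for a single SNP-transcript pair. It is not the pipeline used in Paper I: the function name `eqtl_slope`, the additive 0/1/2 dosage coding, and the toy values are all assumptions made for this example, and a real analysis would add covariates and multiple-testing correction.

```python
from math import sqrt


def eqtl_slope(dosage, expression):
    """Return (slope, Pearson r) for expression regressed on allele dosage.

    dosage: minor-allele counts per sample (0, 1 or 2), an assumed coding.
    expression: expression values for one transcript, same sample order.
    """
    n = len(dosage)
    if n != len(expression) or n < 3:
        raise ValueError("need matched vectors with at least 3 samples")
    mean_g = sum(dosage) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in dosage)
    syy = sum((e - mean_e) ** 2 for e in expression)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosage, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("dosage or expression has no variance")
    slope = sxy / sxx           # expression change per additional allele
    r = sxy / sqrt(sxx * syy)   # strength of the linear association
    return slope, r


# Hypothetical toy data: six samples, two per genotype class.
toy_dosage = [0, 0, 1, 1, 2, 2]
toy_expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
toy_slope, toy_r = eqtl_slope(toy_dosage, toy_expression)
print(f"slope per allele: {toy_slope:.2f}, r: {toy_r:.3f}")
```

Running the same test separately in each tissue and comparing which SNP-transcript pairs pass a significance threshold is one simple way to separate shared from tissue-unique signals.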
Aging is a complex process whose underlying mechanisms remain poorly
defined. There is evidence that the transcriptome changes with age, and hence we used
the brain dataset from our first paper as a discovery set, with an additional replication
dataset, to investigate any aging-gene expression associations. We found evidence that
many genes were associated with aging. We further found that there were more
statistically significant expression changes in the frontal cortex versus the cerebellum,
indicating that brain regions may age at different rates. As the brain is a heterogeneous
tissue including both neurons and non-neuronal cells, we used laser capture
microdissection (LCM) to capture Purkinje
cells as a representative neuronal type and repeated the age analysis. Looking at the
discovery, replication and Purkinje cell datasets we found five genes with strong,
replicated evidence of age-expression associations (Paper II).
Being able to capture and quantify the depth of the transcriptome has been a lengthy
process starting with methods that could only measure a single gene to genome-wide
techniques such as microarray. A recently developed technology, RNA-Seq, shows
promise in its ability to capture expression, splicing, and editing and, with its broad
dynamic range, quantification is accurate and reliable. RNA-Seq is, however, data
intensive and a great deal of computational expertise is required to fully utilize the
strengths of this method. We aimed to create a small, well-controlled experiment in
order to test the performance of this relatively new technology in the brain. We chose
mouse embryonic versus adult cerebral cortex, as mice are genetically homogeneous and there
are many known differences in gene expression related to brain development that we
could use as benchmarks for analysis testing. We found a large number of differences
in total gene expression between embryonic and adult brain. Rigorous technical and
biological validation illustrated the accuracy and dynamic range of RNA-Seq. We were
also able to interrogate differences in exon usage in the same dataset. Finally, we
were able to identify and quantify both well-known and novel A-to-I edit sites. Overall
this project helped us develop the tools needed to build usable pipelines for RNA-Seq
data processing (Paper III).
Our studies in the developing brain (Paper III) illustrated that RNA-Seq was a useful
unbiased method for investigating RNA editing. To extend this further, we utilized a
genetically modified mouse model to study the transcriptomic role of the RNA editing
enzyme ADAR2. We found that ADAR2 was important for editing of the coding
region of mRNA as a large proportion of RNA editing sites in coding regions had a
statistically significant decrease in editing percentages in Adar2-/-Gria2R/R mice versus
controls. However, despite indications in the literature that ADAR2 may also be
involved in splicing and expression regulatory machinery we found no changes in gene
expression or exon utilization in Adar2-/-Gria2R/R mice as compared to their littermate
controls (Paper IV).
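
For readers unfamiliar with how an editing percentage is obtained from sequencing reads, here is a minimal sketch. It relies on the standard observation that inosine is read as guanosine, so A-to-I editing appears as A-to-G mismatches at a site. The function name and the read counts are hypothetical and do not come from Paper IV.

```python
def editing_percent(a_reads, g_reads):
    """Percentage of reads at a site that carry the edited base.

    a_reads: reads showing the unedited A at the site.
    g_reads: reads showing G, the sequencing readout of inosine.
    Returns None when the site has no coverage.
    """
    if a_reads < 0 or g_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = a_reads + g_reads
    if total == 0:
        return None
    return 100.0 * g_reads / total


# Hypothetical counts for one site in a control and a knockout sample.
control_pct = editing_percent(a_reads=25, g_reads=75)
knockout_pct = editing_percent(a_reads=90, g_reads=10)
print(control_pct, knockout_pct)
```

A per-site comparison of these percentages between genotypes, with an appropriate statistical test, is the kind of contrast the paragraph above describes.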
In our final study, based on the methods developed in Papers III and IV, we revisited
the idea of age related gene expression associations from Paper II. We used a subset of
human frontal cortices for RNA sequencing. Interestingly, we found more gene
expression changes with aging compared to the previous data using microarrays in
Paper II. When the significant gene lists were analysed for gene ontology enrichment,
we found that there was a large number of downregulated genes involved in synaptic
function while those that were upregulated had enrichment in immune function. This
dataset illustrates that the aging brain may be predisposed to the processes found in
neurodegenerative diseases (Paper V).
The diversity and evolution of pollination systems in large plant clades: Apocynaceae as a case study
Background and Aims Large clades of angiosperms are often characterized by diverse interactions with pollinators, but how these pollination systems are structured phylogenetically and biogeographically is still uncertain for most families. Apocynaceae is a clade of >5300 species with a worldwide distribution. A database representing >10 % of species in the family was used to explore the diversity of pollinators and evolutionary shifts in pollination systems across major clades and regions.
Methods The database was compiled from published and unpublished reports. Plants were categorized into broad pollination systems and then subdivided to include bimodal systems. These were mapped against the five major divisions of the family, and against the smaller clades. Finally, pollination systems were mapped onto a phylogenetic reconstruction that included those species for which sequence data are available, and transition rates between pollination systems were calculated.
Key Results Most Apocynaceae are insect pollinated with few records of bird pollination. Almost three-quarters of species are pollinated by a single higher taxon (e.g. flies or moths); 7 % have bimodal pollination systems, whilst the remaining approx. 20 % are insect generalists. The less phenotypically specialized flowers of the Rauvolfioids are pollinated by a more restricted set of pollinators than are more complex flowers within the Apocynoids + Periplocoideae + Secamonoideae + Asclepiadoideae (APSA) clade. Certain combinations of bimodal pollination systems are more common than others. Some pollination systems are missing from particular regions, whilst others are over-represented.
Conclusions Within Apocynaceae, interactions with pollinators are highly structured both phylogenetically and biogeographically. Variation in transition rates between pollination systems suggests constraints on their evolution, whereas regional differences point to environmental effects such as filtering of certain pollinators from habitats. This is the most extensive analysis of its type so far attempted and gives important insights into the diversity and evolution of pollination systems in large clades.
Regulatory sites for splicing in human basal ganglia are enriched for disease-relevant information
Genome-wide association studies have generated an increasing number of common genetic variants associated with neurological and psychiatric disease risk. An improved understanding of the genetic control of gene expression in human brain is vital considering this is the likely modus operandi for many causal variants. However, human brain sampling complexities limit the explanatory power of brain-related expression quantitative trait loci (eQTL) and allele-specific expression (ASE) signals. We address this, using paired genomic and transcriptomic data from putamen and substantia nigra from 117 human brains, interrogating regulation at different RNA processing stages and uncovering novel transcripts. We identify disease-relevant regulatory loci, find that splicing eQTLs are enriched for regulatory information of neuron-specific genes, that ASEs provide cell-specific regulatory information with evidence for cellular specificity, and that incomplete annotation of the brain transcriptome limits interpretation of risk loci for neuropsychiatric disease. This resource of regulatory data is accessible through our web server, http://braineacv2.inf.um.es/
Identification of candidate Parkinson disease genes by integrating genome-wide association study, expression, and epigenetic data sets
Importance Substantial genome-wide association study (GWAS) work in Parkinson disease (PD) has led to the discovery of an increasing number of loci shown reliably to be associated with increased risk of disease. Improved understanding of the underlying genes and mechanisms at these loci will be key to understanding the pathogenesis of PD.
Objective To investigate what genes and genomic processes underlie the risk of sporadic PD.
Design and Setting This genetic association study used the bioinformatic tools Coloc and transcriptome-wide association study (TWAS) to integrate PD case-control GWAS data published in 2017 with expression data (from Braineac, the Genotype-Tissue Expression [GTEx], and CommonMind) and methylation data (derived from UK Parkinson brain samples) to uncover putative gene expression and splicing mechanisms associated with PD GWAS signals. Candidate genes were further characterized using cell-type specificity, weighted gene coexpression networks, and weighted protein-protein interaction networks.
Main Outcomes and Measures It was hypothesized a priori that some genes underlying PD loci would alter PD risk through changes to expression, splicing, or methylation. Candidate genes are presented whose change in expression, splicing, or methylation are associated with risk of PD as well as the functional pathways and cell types in which these genes have an important role.
Results Gene-level analysis of expression revealed 5 genes (WDR6 [OMIM 606031], CD38 [OMIM 107270], GPNMB [OMIM 604368], RAB29 [OMIM 603949], and TMEM163 [OMIM 618978]) that replicated using both Coloc and TWAS analyses in both the GTEx and Braineac expression data sets. A further 6 genes (ZRANB3 [OMIM 615655], PCGF3 [OMIM 617543], NEK1 [OMIM 604588], NUPL2 [NCBI 11097], GALC [OMIM 606890], and CTSB [OMIM 116810]) showed evidence of disease-associated splicing effects. Cell-type specificity analysis revealed that gene expression was overall more prevalent in glial cell types compared with neurons. The weighted gene coexpression performed on the GTEx data set showed that NUPL2 is a key gene in 3 modules implicated in catabolic processes associated with protein ubiquitination and in the ubiquitin-dependent protein catabolic process in the nucleus accumbens, caudate, and putamen. TMEM163 and ZRANB3 were both important in modules in the frontal cortex and caudate, respectively, indicating regulation of signaling and cell communication. Protein interactor analysis and simulations using random networks demonstrated that the candidate genes interact significantly more with known mendelian PD and parkinsonism proteins than would be expected by chance.
Conclusions and Relevance Together, these results suggest that several candidate genes and pathways are associated with the findings observed in PD GWAS studies.
A single-nucleotide polymorphism tagging set for human drug metabolism and transport
Interindividual variability in drug response, ranging from no therapeutic benefit to life-threatening adverse reactions, is influenced by variation in genes that control the absorption, distribution, metabolism and excretion of drugs (1). We genotyped 904 single-nucleotide polymorphisms (SNPs) from 55 such genes in two population samples (European and Japanese) and identified a set of tagging SNPs that represents the common variation in these genes, both known and unknown. Extensive empirical evaluations, including a direct assessment of association with candidate functional SNPs in a new, larger population sample, validated the performance of these tagging SNPs and confirmed their utility for linkage-disequilibrium mapping in pharmacogenetics. The analyses also suggest that rare variation is not amenable to tagging strategies.
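
The idea of a tagging set can be sketched with a generic greedy selection: repeatedly pick the SNP that covers the most not-yet-covered SNPs at a chosen correlation threshold. This is only an illustration of the concept and is not the selection procedure used in the study above. The function names, the 0.8 threshold, the use of squared genotype correlation as a stand-in for linkage disequilibrium, and the toy genotypes are all assumptions.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP cannot tag or be tagged
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def greedy_tags(genotypes, threshold=0.8):
    """Pick tag SNPs so every SNP has r^2 >= threshold with some tag.

    genotypes: dict mapping SNP name to a list of dosages (0, 1, 2).
    """
    names = sorted(genotypes)
    # For each SNP, the set of SNPs it would cover (always including itself).
    covers = {
        s: {t for t in names
            if s == t or r_squared(genotypes[s], genotypes[t]) >= threshold}
        for s in names
    }
    untagged = set(names)
    tags = []
    while untagged:
        # Ties resolve to the alphabetically first SNP because names is sorted.
        best = max(names, key=lambda s: len(covers[s] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Hypothetical toy genotypes: snpA and snpB are perfectly correlated.
toy_genotypes = {
    "snpA": [0, 1, 2, 0, 1, 2],
    "snpB": [0, 1, 2, 0, 1, 2],
    "snpC": [2, 0, 1, 1, 0, 2],
}
toy_tags = greedy_tags(toy_genotypes)
print(toy_tags)
```

In this toy case one of the two correlated SNPs is enough to represent both, while the uncorrelated SNP must be genotyped directly. Rare variants tend to have low correlation with common SNPs, which is consistent with the abstract's remark that rare variation is not amenable to tagging.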