
Lancaster E-Prints

A Bayesian method to incorporate hundreds of functional characteristics with association evidence to improve variant prioritization

Author: Barnes Michael R.
Gagliano Sarah A.
Knight Jo
Weale Michael E.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

The increasing quantity and quality of functional genomic information motivate the assessment and integration of these data with association data, including data originating from genome-wide association studies (GWAS). We used previously described GWAS signals ("hits") to train a regularized logistic model in order to predict SNP causality on the basis of a large multivariate functional dataset. We show how this model can be used to derive Bayes factors for integrating functional and association data into a combined Bayesian analysis. Functional characteristics were obtained from the Encyclopedia of DNA Elements (ENCODE), from published expression quantitative trait loci (eQTL), and from other sources of genome-wide characteristics. We trained the model using all GWAS signals combined, and also using phenotype-specific signals for autoimmune, brain-related, cancer, and cardiovascular disorders. The non-phenotype-specific and the autoimmune GWAS signals gave the most reliable results. We found that SNPs with higher probabilities of causality from functional characteristics showed an enrichment of more significant p-values compared to all GWAS SNPs in three large GWAS studies of complex traits. We investigated the ability of our Bayesian method to improve the identification of true causal signals in a psoriasis GWAS dataset and found that combining functional data with association data improves the ability to prioritise novel hits. We used the predictions from the penalized logistic regression model to calculate Bayes factors relating to functional characteristics and supply these online alongside resources to integrate these data with association data.

Directory of Open Access Journals

Lancaster E-Prints

University of Huddersfield Repository

FigShare

Salivary amylase gene copy number: Have humans adapted to high starch diets?

Author: Caldwell Elizabeth F.
Thomas Mark G.
Von Crammon-Taubadel Noreen
Weale Michael E.

Assessing models for genetic prediction of complex traits: a comparison of visualization and quantitative methods

Author: Gagliano Sarah A.
Knight Jo
Paterson Andrew D.
Weale Michael E.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/05/2015

BACKGROUND: In silico models have recently been created in order to predict which genetic variants are more likely to contribute to the risk of a complex trait given their functional characteristics. However, there has been no comprehensive review as to which type of predictive accuracy measures and data visualization techniques are most useful for assessing these models. METHODS: We assessed the performance of the models for predicting risk using various methodologies, including receiver operating characteristic (ROC) curves, histograms of classification probability, and the novel use of the quantile-quantile plot. These measures have variable interpretability depending on factors such as whether the dataset is balanced in terms of numbers of genetic variants classified as risk variants versus those that are not. RESULTS: We conclude that the area under the curve (AUC) is a suitable starting place, and for models with similar AUCs, violin plots are particularly useful for examining the distribution of the risk scores.

Springer - Publisher Connector

Lancaster E-Prints

Delta-Centralization Fails to Control for Population Stratification in Genetic Association Studies

Author: Cathryn M. Lewis
Michael E. Weale
Tony Dadd
Publication venue: 'S. Karger AG'

Little genetic differentiation as assessed by uniparental markers in the presence of substantial language variation in peoples of the Cross River region of Nigeria

Author: Bradman Neil
Connell Bruce A
Mendell Nancy R
Plaster Christopher A
Pour Naser Ansari
Powell Adam
Thomas Mark G
Veeramah Krishna R
Weale Michael E
Zeitlyn David
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The Cross River region in Nigeria is an extremely diverse area linguistically, with over 60 distinct languages still spoken today. It is also a region of great historical importance, being a) adjacent to the likely homeland from which Bantu-speaking people migrated across most of sub-Saharan Africa 3000-5000 years ago and b) the location of Calabar, one of the largest centres during the Atlantic slave trade. Over 1000 DNA samples from 24 clans representing speakers of the six most prominent languages in the region were collected and typed for Y-chromosome (SNPs and microsatellites) and mtDNA markers (Hypervariable Segment 1) in order to examine whether there has been substantial gene flow between groups speaking different languages in the region. In addition, the Cross River region was analysed in the context of a larger geographical scale by comparison to bordering Igbo-speaking groups as well as neighbouring Cameroon populations and more distant Ghanaian communities. Results: The Cross River region was shown to be extremely homogeneous for both Y-chromosome and mtDNA markers, with language spoken having no noticeable effect on the genetic structure of the region, consistent with estimates of inter-language gene flow of 10% per generation based on sociological data. However, the groups in the region could clearly be differentiated from others in Cameroon and Ghana (and to a lesser extent Igbo populations). Significant correlations between genetic distance and both geographic and linguistic distance were observed at this larger scale. Conclusions: Previous studies have found significant correlations between genetic variation and language in Africa over large geographic distances, often across language families. However, the broad sampling strategies of these datasets have limited their utility for understanding the relationship within language families. This is the first study to show that at very fine geographic/linguistic scales language differences can be maintained in the presence of substantial gene flow over an extended period of time, and it demonstrates the value of dense sampling strategies and having DNA of known and detailed provenance, a practice that is generally rare when investigating sub-Saharan African demographic processes using genetic data.

Springer - Publisher Connector

Directory of Open Access Journals

eScholarship - University of California

Oxford University Research Archive

MPG.PuRe

Analysis of subcellular RNA fractions demonstrates significant genetic regulation of gene expression in human brain post-transcriptionally

Author: Botía Juan A
D'Sa Karishma
Guelfi Sebastian
Hardy John
Reynolds Regina H
Ryten Mina
Small Kerrin S
Taliun Sarah A Gagliano
Vandrovcova Jana
Weale Michael E
Zhang David
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/08/2023

Gaining insight into the genetic regulation of gene expression in human brain is key to the interpretation of genome-wide association studies for major neurological and neuropsychiatric diseases. Expression quantitative trait loci (eQTL) analyses have largely been used to achieve this, providing valuable insights into the genetic regulation of steady-state RNA in human brain, but not distinguishing between molecular processes regulating transcription and stability. RNA quantification within cellular fractions can disentangle these processes in cell types and tissues which are challenging to model in vitro. We investigated the underlying molecular processes driving the genetic regulation of gene expression specific to a cellular fraction using allele-specific expression (ASE). Applying ASE analysis to genomic and transcriptomic data from paired nuclear and cytoplasmic fractions of anterior prefrontal cortex, cerebellar cortex and putamen tissues from 4 post-mortem neuropathologically-confirmed control human brains, we demonstrate that a significant proportion of genetic regulation of gene expression occurs post-transcriptionally in the cytoplasm, with genes undergoing this form of regulation more likely to be synaptic. These findings have implications for understanding the structure of gene expression regulation in human brain, and importantly the interpretation of rapidly growing single-nucleus brain RNA-sequencing and eQTL datasets, where cytoplasm-specific regulatory events could be missed.

Central Archive at the University of Reading


Investigating the utility of human embryonic stem cell-derived neurons to model ageing and neurodegenerative disease using whole-genome gene expression and splicing analysis

Author: Chandran Siddharthan
Hardingham Giles E.
Hardy John
Lewis Patrick A
Patani Rickie
Puddifoot Clare A.
Ryten Mina
Smith Colin
Trabzuni Daniah
Walker Robert
Weale Michael
Wyllie David J. A.
Publication venue: 'Wiley'
Publication date: 01/01/2012

A major goal in regenerative medicine is the predictable manipulation of human embryonic stem cells (hESCs) to defined cell fates that faithfully represent their somatic counterparts. Directed differentiation of hESCs into neuronal populations has galvanized much interest into their potential application in modelling neurodegenerative disease. However, neurodegenerative diseases are age-related, and therefore establishing the maturational comparability of hESC-derived neural derivatives is critical to generating accurate in vitro model systems. We address this issue by comparing genome-wide, exon-specific expression analyses of pluripotent hESCs, multipotent neural precursor cells and a terminally differentiated enriched neuronal population to expression data from post-mortem foetal and adult human brain samples. We show that hESC-derived neuronal cultures (using a midbrain differentiation protocol as a prototypic example of lineage restriction), while successful in generating physiologically functional neurons, are closer to foetal than adult human brain in terms of molecular maturation. These findings suggest that developmental stage has a more dominant influence on the cellular transcriptome than regional identity. In addition, we demonstrate that developmentally regulated gene splicing is common, and potentially a more sensitive measure of maturational state than gene expression profiling alone. In summary, this study highlights the value of genomic indices in refining and validating optimal cell populations appropriate for modelling ageing and neurodegeneration.

Edinburgh Research Explorer

Genetic evidence for a pathogenic role for the vitamin D3 metabolizing enzyme CYP24A1 in multiple sclerosis

Author: Dillman Allissa
Forabosco Paola
Hardy John
Ramasamy Adaikalavan
Ryten Mina
Smith Colin
Sveinbjornsdottir Sigurlaug
Trabzuni Daniah
Walker Robert
Weale Michael E.
Publication venue: 'Elsevier BV'
Publication date: 01/03/2014

Background: Multiple sclerosis (MS) is a common disease of the central nervous system and a major cause of disability amongst young adults. Genome-wide association studies have identified many novel susceptibility loci including rs2248359. We hypothesized that genotypes of this locus could increase the risk of MS by regulating expression of the neighboring gene, CYP24A1, which encodes the enzyme responsible for initiating degradation of 1,25-dihydroxyvitamin D3. Methods: We investigated this hypothesis using paired gene expression and genotyping data from three independent datasets of neurologically healthy adults of European descent. The UK Brain Expression Consortium (UKBEC) consists of post-mortem samples across 10 brain regions originating from 134 individuals (1231 samples total). The North American Brain Expression Consortium (NABEC) consists of cerebellum and frontal cortex samples from 304 individuals (605 samples total). The brain dataset from Heinzen and colleagues consists of prefrontal cortex samples from 93 individuals. Additionally, we used gene network analysis to analyze UKBEC expression data to understand CYP24A1 function in human brain. Findings: The risk allele, rs2248359-C, is strongly associated with increased expression of CYP24A1 in frontal cortex (p-value=1.45×10^-13), but not white matter. This association was replicated using data from NABEC (p-value=7.2×10^-6) and Heinzen and colleagues (p-value=1.2×10^-4). Network analysis shows a significant enrichment of terms related to immune response in eight out of the 10 brain regions. Interpretation: The known MS risk allele rs2248359-C increases CYP24A1 expression in human brain, providing a genetic link between MS and vitamin D metabolism, and predicting that the physiologically active form of vitamin D3 is protective. Vitamin D3's involvement in MS may relate to its immunomodulatory functions in human brain.
Funding: Medical Research Council UK; King Faisal Specialist Hospital and Research Centre, Saudi Arabia; Intramural Research Program of the National Institute on Aging, National Institutes of Health, USA.

Edinburgh Research Explorer

UnissResearch

Dense sampling of ethnic groups within African countries reveals fine-scale genetic structure and extensive historical admixture

Author: Awah Paschal
Bird Nancy
Bradman Neil
Caldwell Elizabeth F
Connell Bruce
Elamin Mohamed
Fadlelmola Faisal M
Hellenthal Garrett
López Saioa
MacEachern Scott
Matthew Fomine Forka Leypey
Morris Sam
Moñino Yves
Nketsia V Nana Kobina
Näsänen-Gilmore Pieta
Ormond Louise
Thomas Mark G
Veeramah Krishna
Weale Michael E
Zeitlyn David
Publication date: 29/03/2023

Previous studies have highlighted how African genomes have been shaped by a complex series of historical events. Despite this, genome-wide data have only been obtained from a small proportion of present-day ethnolinguistic groups. By analyzing new autosomal genetic variation data of 1333 individuals from over 150 ethnic groups from Cameroon, Republic of the Congo, Ghana, Nigeria, and Sudan, we demonstrate a previously underappreciated fine-scale level of genetic structure within these countries, for example, correlating with historical polities in western Cameroon. By comparing genetic variation patterns among populations, we infer that many northern Cameroonian and Sudanese groups share genetic links with multiple geographically disparate populations, likely resulting from long-distance migrations. In Ghana and Nigeria, we infer signatures of intermixing dated to over 2000 years ago, corresponding to reports of environmental transformations possibly related to climate change. We also infer recent intermixing signals in multiple African populations, including Congolese, that likely relate to the expansions of Bantu language-speaking peoples.