182 research outputs found

    Computational techniques to interpret the neural code underlying complex cognitive processes

    Advances in large-scale neural recording technology have significantly improved our capacity to elucidate the neural code underlying complex cognitive processes. This thesis investigated two research questions in rodent models. First, what is the role of the hippocampus in memory, and specifically, what neural code underlies spatial memory and navigational decision-making? Second, how is social cognition represented in the medial prefrontal cortex at the level of individual neurons? The thesis begins by investigating memory and social cognition in healthy and diseased states using non-invasive methods (i.e. fMRI and animal behavioural studies). The main body of the thesis then shifts to developing a fundamental understanding of the neural mechanisms underpinning these cognitive processes by applying computational techniques to analyse stable, large-scale neural recordings. To achieve this, tailored computational pipelines for calcium imaging and behavioural preprocessing were developed and optimised for the analysis of social interaction and spatial navigation experiments. In parallel, a review was conducted on methods for multivariate/neural population analysis. A comparison of multiple neural manifold learning (NML) algorithms identified that non-linear algorithms such as UMAP are more adaptable across datasets of varying noise and behavioural complexity. Furthermore, the review illustrates how NML can be applied to disease states in the brain and introduces the secondary analyses that can be used to enhance or characterise a neural manifold. Lastly, the preprocessing and analytical pipelines were combined to investigate the neural mechanisms involved in social cognition and spatial memory. The social cognition study explored how neural firing in the medial prefrontal cortex changed as a function of a social dominance paradigm, the "Tube Test". The univariate analysis identified an ensemble of behaviourally tuned neurons that fire preferentially during specific behaviours, such as "pushing" or "retreating", for the animal's own behaviour and/or the competitor's behaviour. Furthermore, in dominant animals, the neural population exhibited greater average firing than in subordinate animals. Next, to investigate spatial memory, a spatial recency task was used in which rats learnt to navigate towards one of three reward locations and then recall the rewarded location of the session. During the task, over 1000 neurons were recorded from the hippocampal CA1 region of five rats across multiple sessions. Multivariate analysis revealed that the sequence of neurons encoding an animal's spatial position leading up to a rewarded location was also active in the decision period before the animal navigated to the rewarded location. This result suggests that prospective replay of neural sequences in the hippocampal CA1 region could provide a mechanism by which decision-making is supported.
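    The comparison of neural manifold learning algorithms summarised above centres on dimensionality reduction of population activity. As a hedged illustration of that general workflow (not the thesis pipeline itself), the sketch below embeds a hypothetical time-bins-by-neurons firing-rate matrix with UMAP and a PCA baseline; the array names, placeholder data and parameter choices are assumptions.

```python
# Illustrative neural manifold embedding with UMAP vs. a PCA baseline.
# Assumes `rates` is a (time_bins x neurons) firing-rate matrix; all names,
# data and parameters here are hypothetical, not the thesis pipeline.
import numpy as np
from sklearn.decomposition import PCA
import umap  # pip install umap-learn

rng = np.random.default_rng(0)
rates = rng.poisson(lam=2.0, size=(1000, 120)).astype(float)  # placeholder data

# Linear baseline: project activity onto the first 3 principal components.
pca_embedding = PCA(n_components=3).fit_transform(rates)

# Non-linear embedding: UMAP preserves local neighbourhood structure, which is
# one reason it adapts well to noisy population recordings.
umap_embedding = umap.UMAP(n_components=3, n_neighbors=30,
                           min_dist=0.1, random_state=0).fit_transform(rates)

print(pca_embedding.shape, umap_embedding.shape)  # (1000, 3) (1000, 3)
```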

    Inter-individual variation of the human epigenome & applications

    Genome-wide association studies (GWAS) have led to the discovery of genetic variants influencing human phenotypes in health and disease. However, almost two decades later, most human traits still cannot be accurately predicted from common genetic variants. Moreover, genetic variants discovered via GWAS mostly map to the non-coding genome and have historically resisted interpretation via mechanistic models. The epigenome, by contrast, lies at the crossroads between genetics and the environment. There is therefore great interest in mapping epigenetic inter-individual variation, since its study may link environmental factors to human traits that remain unexplained by genetic variants. For instance, the environmental component of the epigenome may serve as a source of biomarkers for accurate, robust and interpretable phenotypic prediction of low-heritability traits that cannot be attained by classical genetics-based models. Additionally, its study may provide mechanisms of action for genetic associations at non-coding regions that mediate their effect via the epigenome. The aim of this thesis was to explore epigenetic inter-individual variation and to mitigate some of the methodological limitations faced towards its future valorisation.
    Chapter 1 is dedicated to the scope and aims of the thesis. It begins by describing historical milestones and basic concepts in human genetics, statistical genetics, the heritability problem and polygenic risk scores. It then moves towards epigenetics, covering the several dimensions it encompasses, and subsequently focuses on DNA methylation, with topics such as mitotic stability, epigenetic reprogramming, X-inactivation and imprinting. This is followed by concepts from epigenetic epidemiology, such as epigenome-wide association studies (EWAS), epigenetic clocks, Mendelian randomization, methylation risk scores and methylation quantitative trait loci (mQTL). The chapter ends by introducing the aims of the thesis.
    Chapter 2 focuses on stochastic epigenetic inter-individual variation resulting from processes occurring post-twinning, during embryonic development and early life. Specifically, it describes the discovery and characterisation of hundreds of variably methylated CpGs in the blood of healthy adolescent monozygotic (MZ) twins showing equivalent variation among co-twins and unrelated individuals (evCpGs), variation that could not be explained by measurement error on the DNA methylation microarray alone. DNA methylation levels at evCpGs were shown to be stable in the short term but susceptible to ageing and epigenetic drift in the long term. The identified sites were significantly enriched at the clustered protocadherin loci, known for stochastic methylation in neurons in the context of embryonic neurodevelopment. Critically, evCpGs were capable of clustering technical and longitudinal replicates while differentiating young MZ twins. The discovered evCpGs can thus be considered a first prototype of a universal epigenetic fingerprint, relevant for discriminating MZ twins in forensic settings, which is currently impossible with standard DNA profiling.
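    The screening idea behind Chapter 2, as summarised above, is to flag CpGs whose variation between co-twins is comparable to that between unrelated individuals and clearly exceeds technical noise. The sketch below is a simplified, assumed reconstruction of that logic on simulated methylation beta values; the array names, variance estimators and cut-offs are illustrative placeholders, not the thesis implementation.

```python
# Illustrative evCpG-style screen: keep CpGs where co-twin variability is
# comparable to unrelated-individual variability and exceeds technical noise.
# All inputs, estimators and thresholds are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(1)
n_pairs, n_cpgs = 100, 5000
betas_twin_a = rng.beta(2, 5, size=(n_pairs, n_cpgs))                        # twin 1 of each pair
betas_twin_b = betas_twin_a + rng.normal(0, 0.02, (n_pairs, n_cpgs))         # twin 2 (simulated)
betas_replicates = betas_twin_a[:20] + rng.normal(0, 0.005, (20, n_cpgs))    # technical replicates

# var(A - B) = 2 * sigma^2 for independent, equal-variance measurements,
# so halving the difference variance gives a per-measurement variance estimate.
within_pair_var = np.var(betas_twin_a - betas_twin_b, axis=0) / 2.0
between_indiv_var = np.var(betas_twin_a, axis=0)
technical_var = np.var(betas_twin_a[:20] - betas_replicates, axis=0) / 2.0

# evCpG-like criterion (illustrative): co-twin variation roughly matches
# population variation, and both sit well above measurement error.
ratio = within_pair_var / between_indiv_var
candidates = np.where((ratio > 0.8) & (within_pair_var > 5 * technical_var))[0]
print(f"{candidates.size} candidate evCpG-like sites")
```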
    Besides, DNA methylation microarrays are the preferred technology for EWAS and mQTL mapping studies. However, their probe design inherently assumes that the assayed genomic DNA is identical to the reference genome, leading to genetic artifacts whenever this assumption is not fulfilled. Building upon the previous experience analysing microarray data, Chapter 3 covers the development and benchmarking of UMtools, an R package for the quantification and qualification of genetic artifacts on DNA methylation microarrays based on the unprocessed fluorescence intensity signals. These tools were used to assemble an atlas of genetic artifacts encountered on DNA methylation microarrays, including interactions between artifacts or with X-inactivation, imprinting and tissue-specific regulation. Additionally, to distinguish artifacts from genuine epigenetic variation, a co-methylation-based approach was proposed. Overall, this study revealed that genetic artifacts continue to filter through into the reported literature, since current methodologies to address them have overlooked this challenge.
    Furthermore, EWAS, mQTL and allele-specific methylation (ASM) mapping studies have all been employed to map epigenetic variation, but they require matching phenotypic/genotypic data and can only map specific components of epigenetic inter-individual variation. Inspired by the previously proposed co-methylation strategy, Chapter 4 describes a novel method to simultaneously map inter-haplotype, inter-cell and inter-individual variation without these requirements. Specifically, the binomial likelihood function-based bootstrap hypothesis test for co-methylation within reads (Binokulars) is a randomization test that can identify jointly regulated CpGs (JRCs) from pooled whole genome bisulfite sequencing (WGBS) data by relying solely on the joint DNA methylation information available in reads spanning multiple CpGs. Binokulars was tested on pooled WGBS data from whole blood, sperm and both combined, and benchmarked against EWAS and ASM. Our comparisons revealed that Binokulars can integrate a wide range of epigenetic phenomena under the same umbrella, since it simultaneously discovered regions associated with imprinting, cell type- and tissue-specific regulation, mQTL, ageing or even unknown epigenetic processes. Finally, we verified examples of mQTL and polymorphic imprinting by employing another novel tool, JRC_sorter, to classify regions based on epigenotype models and non-pooled WGBS data in cord blood. In the future, we envision how this cost-effective approach can be applied to larger pools to simultaneously highlight regions of interest in the methylome, a highly relevant task in the light of the post-GWAS era.
    Moving towards future applications of epigenetic inter-individual variation, Chapters 5 and 6 are dedicated to solving some of the methodological issues faced in translational epigenomics. Firstly, due to its simplicity and well-known properties, linear regression is the starting-point methodology for predicting a continuous outcome from a set of predictors. However, linear regression is incompatible with missing data, a common phenomenon and a major threat to the integrity of data analysis in the empirical sciences, including (epi)genomics. Chapter 5 describes the development of combinatorial linear models (cmb-lm), an imputation-free, CPU/RAM-efficient and privacy-preserving statistical method for linear regression prediction on datasets with missing values. Cmb-lm provide prediction errors that take into account the pattern of missing values in the incomplete data, even at extreme missingness. As a proof of concept, we tested cmb-lm in the context of epigenetic ageing clocks, one of the most popular applications of epigenetic inter-individual variation. Overall, cmb-lm offer a simple and flexible methodology with a wide range of applications that can provide a smooth transition towards the valorisation of linear models in the real world, where missing data is almost inevitable.
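    The abstract does not spell out the cmb-lm algorithm, so the following is only a hedged sketch of the general idea it alludes to: making linear-model predictions without imputation by restricting each incomplete sample to a model over its observed predictors. The data, model choice and function name are assumptions for illustration, not the published cmb-lm method.

```python
# Hedged sketch: imputation-free linear prediction by refitting an ordinary
# least-squares model on the observed-predictor pattern of each new sample.
# This illustrates the general idea only, not the cmb-lm method of Chapter 5.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
X_train = rng.normal(size=(500, 5))                     # e.g. methylation predictors
y_train = X_train @ np.array([2.0, -1.0, 0.5, 0.0, 1.5]) + rng.normal(0, 0.5, 500)

def predict_with_missing(x_new, X_train, y_train):
    """Predict y for one sample containing NaNs by fitting on its observed columns."""
    observed = ~np.isnan(x_new)
    model = LinearRegression().fit(X_train[:, observed], y_train)
    return model.predict(x_new[observed].reshape(1, -1))[0]

x_incomplete = np.array([0.3, np.nan, -1.2, np.nan, 0.8])  # hypothetical new sample
print(predict_with_missing(x_incomplete, X_train, y_train))
```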
    Beyond microarrays, due to its high accuracy, reliability and sample multiplexing capabilities, massively parallel sequencing (MPS) is currently the preferred methodology for translating prediction models for traits of interest into practice. At the same time, tobacco smoking is a habit sustained by more than 1.3 billion people as of 2020 and a leading (and preventable) health risk factor in the modern world. Predicting smoking habits from a persistent biomarker, such as DNA methylation, is not only relevant for accounting for self-reporting bias in public health and personalized medicine studies, but may also allow broadening forensic DNA phenotyping. Previously, a model to predict whether someone is a current, former or never smoker had been published based on only 13 CpGs out of the hundreds of thousands included on the DNA methylation microarray. However, a matching laboratory tool with lower marker throughput and higher accuracy and sensitivity was missing for translating the model into practice. Chapter 6 describes the development of an MPS assay and data analysis pipeline to quantify DNA methylation at these 13 smoking-associated biomarkers for the prediction of smoking status. Though our systematic evaluation on DNA standards of known methylation levels revealed marker-specific amplification bias, our novel tool was still able to provide highly accurate and reproducible DNA methylation quantification and smoking habit prediction. Overall, our MPS assay allows the technological transfer of DNA methylation microarray findings and models to practical settings, one step closer to future applications.
    Finally, Chapter 7 provides a general discussion of the results and topics covered across Chapters 2-6. It begins by summarising the main findings across the thesis, including proposals for follow-up studies. It then covers technical limitations pertaining to bisulfite conversion and DNA methylation microarrays, as well as more general considerations such as restricted data access. The chapter ends by covering the outlook of this PhD thesis, including topics such as bisulfite-free methods, third-generation sequencing, single-cell methylomics, multi-omics and systems biology.
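    The published 13-CpG smoking model referenced above is not reproduced in the abstract, so the sketch below only illustrates the general shape of such a predictor: a multinomial classifier over per-marker methylation fractions that outputs current/former/never labels. The simulated data and the choice of a generic logistic regression are assumptions, not the published model or the Chapter 6 assay pipeline.

```python
# Hedged illustration of a smoking-status predictor over a small panel of CpG
# methylation levels (fractions in [0, 1]). Data are simulated and the model is
# a generic multinomial logistic regression, not the published 13-CpG model.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n_markers = 13                                  # panel size mentioned in the abstract
labels = np.array(["never", "former", "current"])
y = rng.integers(0, 3, size=300)
X = rng.beta(5, 5, size=(300, n_markers))       # simulated methylation fractions
X[:, 0] -= 0.1 * (y == 2)                       # e.g. hypomethylation in current smokers

clf = LogisticRegression(max_iter=1000).fit(X, labels[y])
new_sample = rng.beta(5, 5, size=(1, n_markers))
print(clf.predict(new_sample), clf.predict_proba(new_sample).round(2))
```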

    Simultaneous Multiparametric and Multidimensional Cardiovascular Magnetic Resonance Imaging

    No abstract available

    Uncovering the complex genetic architecture of human plasma lipidome using machine learning methods

    The genetic architecture of the plasma lipidome provides insights into the regulation of lipid metabolism and related diseases. We applied an unsupervised machine learning method, PGMRA, to discover many-to-many relations between genotype and the plasma lipidome (phenotype) in order to identify the genetic architecture of the plasma lipidome profiled in 1,426 Finnish individuals aged 30-45 years. PGMRA involves biclustering genotype and lipidome data independently, followed by their inter-domain integration based on hypergeometric tests of the number of shared individuals. Pathway enrichment analysis was performed on the SNP sets to identify their associated biological processes. We identified 93 statistically significant (hypergeometric p-value < 0.01) lipidome-genotype relations. Genotype biclusters in these 93 relations contained 5,977 SNPs across 3,164 genes. Twenty-nine of the 93 relations contained genotype biclusters with more than 50% unique SNPs and participants, thus representing the most distinct subgroups. We identified 30 significantly enriched biological processes among the SNPs involved in 21 of these 29 most distinct genotype-lipidome subgroups, through which the identified genetic variants can influence and regulate plasma lipid-related metabolism and profiles. This study identified 29 distinct genotype-lipidome subgroups in the studied Finnish population that may have distinct disease trajectories and therefore could be useful in precision medicine research.
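    The inter-domain integration step described above rests on a hypergeometric test of how many individuals a genotype bicluster and a lipidome bicluster share. As a hedged illustration of that single step (with invented set sizes, not the PGMRA pipeline), the enrichment p-value can be computed as follows.

```python
# Hypergeometric test for the overlap between a genotype bicluster and a
# lipidome bicluster, as in the integration step described in the abstract.
# The set sizes below are invented for illustration.
from scipy.stats import hypergeom

N_total = 1426        # cohort size
n_genotype = 120      # individuals in the genotype bicluster
n_lipidome = 150      # individuals in the lipidome bicluster
k_shared = 40         # individuals appearing in both biclusters

# P(overlap >= k_shared) under random sampling without replacement.
p_value = hypergeom.sf(k_shared - 1, N_total, n_genotype, n_lipidome)
print(f"hypergeometric p-value = {p_value:.3g}")
```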

    Exercise-Induced Hypoalgesia in people with chronic low back pain

    Chronic low back pain (CLBP) is one of the most prevalent musculoskeletal disorders and a major contributor to disability worldwide. Exercise is recommended in guidelines as a cornerstone of the management of CLBP. One of the manifold benefits of exercise is its influence on endogenous pain modulation: an acute bout of exercise elicits a temporary decrease in pain sensitivity, described as exercise-induced hypoalgesia (EIH). This thesis explores EIH in people with CLBP via a systematic review and observational studies. The systematic review included 17 studies in people with spinal pain. Of those, four studies considered people with CLBP, revealing very-low-quality evidence with conflicting results. EIH was elicited following remote cycling tasks (two studies, fair risk of bias), but EIH was altered following local repetitive lifting tasks (two studies, good/fair risk of bias). The observational studies investigated EIH following three different tasks in participants with and without CLBP and explored the stability of EIH results. Conflicting results from quantitative sensory testing were found for whether EIH is impaired in people with CLBP. EIH was only elicited in asymptomatic participants following a repeated lifting task, but participants both with and without CLBP showed EIH following a lumbar resistance task and a brisk-walking task. This thesis provides the first evidence on the stability of EIH over multiple sessions. However, interpretation of the results is challenging, as stability was poor and changes in lumbar pressure pain thresholds also occurred after rest alone. These findings are important to inform future studies contributing to the elucidation of the complex phenomenon of EIH in people with and without CLBP, particularly because stability is a prerequisite for future research.
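    EIH as described above is typically quantified from quantitative sensory testing, for example as the change in pressure pain threshold (PPT) measured immediately before and after the exercise bout. The snippet below is a minimal, assumed example of that computation on hypothetical PPT values, together with a paired test; it does not reproduce the thesis protocol or its stability analysis.

```python
# Minimal example: EIH quantified as the pre-to-post change in pressure pain
# threshold (PPT, in kPa) around an exercise bout, with a paired t-test.
# The values are hypothetical, not data from the thesis.
import numpy as np
from scipy.stats import ttest_rel

ppt_pre = np.array([310, 420, 285, 390, 450, 330, 405, 360], dtype=float)
ppt_post = np.array([345, 455, 300, 410, 470, 350, 430, 380], dtype=float)

eih = ppt_post - ppt_pre                 # positive values indicate hypoalgesia
t_stat, p_value = ttest_rel(ppt_post, ppt_pre)
print(f"mean EIH = {eih.mean():.1f} kPa, t = {t_stat:.2f}, p = {p_value:.3f}")
```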

    Quantitative MRI in cardiometabolic disease: From conventional cardiac and liver tissue mapping techniques to multi-parametric approaches

    Cardiometabolic disease refers to the spectrum of chronic conditions that include diabetes, hypertension, atheromatosis, non-alcoholic fatty liver disease, and their long-term impact on cardiovascular health. Histological studies have confirmed several modifications at the tissue level in cardiometabolic disease. Recently, quantitative MR methods have enabled non-invasive myocardial and liver tissue characterization. MR relaxation mapping techniques such as T1, T1ρ, T2 and T2* provide a pixel-by-pixel representation of the corresponding tissue-specific relaxation times, which have been shown to correlate with fibrosis, altered tissue perfusion, oedema and iron levels. Proton density fat fraction mapping approaches allow measurement of the lipid content of the organ of interest. Several studies have demonstrated their utility as early diagnostic biomarkers and their potential prognostic value. Conventionally, the quantification of these parameters by MRI relies on the acquisition of sequential scans, encoding and mapping only one parameter per scan. However, this methodology is time-inefficient and suffers from the confounding effects of the relaxation parameters in each single map, limiting wider clinical and research applications. To address these limitations, several novel approaches have been proposed that encode multiple tissue parameters simultaneously, providing co-registered multiparametric information on the tissues of interest. This review aims to describe the multi-faceted myocardial and hepatic tissue alterations in cardiometabolic disease and to motivate the application of relaxometry and proton-density cardiac and liver tissue mapping techniques. Current approaches to myocardial and liver tissue characterization, as well as the latest technical developments in multiparametric quantitative MRI, are included. Limitations and challenges of these novel approaches, together with recommendations to facilitate clinical validation, are also discussed.
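    Relaxation mapping as described above amounts to fitting a signal model pixel by pixel across images acquired at different preparation or echo times. As a hedged, generic example (not any specific sequence from the review), the sketch below fits a mono-exponential T2 decay, S(TE) = S0 * exp(-TE / T2), to one pixel's synthetic signal across echo times.

```python
# Generic pixel-wise T2 mapping example: fit S(TE) = S0 * exp(-TE / T2) to the
# signal of one pixel across echo times. Echo times and signals are synthetic.
import numpy as np
from scipy.optimize import curve_fit

def t2_decay(te, s0, t2):
    return s0 * np.exp(-te / t2)

te = np.array([10.0, 20.0, 40.0, 60.0, 80.0])        # echo times (ms)
true_s0, true_t2 = 1000.0, 45.0                      # ground truth for the synthetic pixel
signal = t2_decay(te, true_s0, true_t2) + np.random.default_rng(4).normal(0, 10, te.size)

# Non-linear least-squares fit; p0 supplies rough starting values.
popt, _ = curve_fit(t2_decay, te, signal, p0=(signal[0], 50.0))
s0_fit, t2_fit = popt
print(f"fitted T2 = {t2_fit:.1f} ms (true {true_t2} ms)")

# A full T2 map repeats this fit for every pixel of the image series.
```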

    Improving Diagnostics with Deep Forest Applied to Electronic Health Records

    Electronic health records (EHRs) are a vital, high-dimensional source of medical concepts. Discovering the implicit correlations within these data, and exploiting them for research and clinical insight, can improve treatment and care-management processes. The key challenge is that limitations of the data sources make it difficult to find a stable model that relates medical concepts and uses these existing connections. This paper presents Patient Forest, a novel end-to-end approach for learning patient representations from tree-structured data for readmission and mortality prediction tasks. By leveraging statistical features, the proposed model provides an accurate and reliable classifier for predicting readmission and mortality. Experiments on the MIMIC-III and eICU datasets demonstrate that Patient Forest outperforms existing machine learning models, especially when the training data are limited. Additionally, a qualitative evaluation of Patient Forest is conducted by visualising the learnt representations in 2D space using t-SNE, which further confirms the effectiveness of the proposed model in learning EHR representations.
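    The qualitative evaluation mentioned above projects learnt patient representations into 2D with t-SNE. The snippet below is an assumed, generic version of that visualisation step on a hypothetical representation matrix (`patient_repr`) with readmission labels; it is not the Patient Forest code.

```python
# Generic 2D t-SNE projection of learnt patient representations, coloured by
# readmission label. `patient_repr` and `readmitted` are hypothetical stand-ins.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

rng = np.random.default_rng(5)
patient_repr = rng.normal(size=(500, 64))        # e.g. 64-dimensional learnt embeddings
readmitted = rng.integers(0, 2, size=500)        # binary readmission label

embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(patient_repr)

plt.scatter(embedding[:, 0], embedding[:, 1], c=readmitted, cmap="coolwarm", s=8)
plt.xlabel("t-SNE 1")
plt.ylabel("t-SNE 2")
plt.title("Patient representations (illustrative)")
plt.show()
```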