1,576 research outputs found

    Network-based methods for biological data integration in precision medicine

    The vast and continuously increasing volume of biomedical data produced over recent decades opens new opportunities for large-scale modeling of disease biology, facilitating a more comprehensive and integrative understanding of its processes. Nevertheless, this type of modeling requires highly efficient computational systems capable of handling such data volumes. Computational approaches commonly used in machine learning and data analysis, namely dimensionality reduction and network-based methods, have been developed with the goal of effectively integrating biomedical data. Among these methods, network-based machine learning stands out for its biomedical interpretability. These methodologies provide a highly intuitive framework for the integration and modeling of biological processes. This PhD thesis explores the potential of integrating complementary biomedical knowledge with patient-specific data to provide novel computational approaches for biomedical scenarios characterized by data scarcity. The primary focus is on studying how high-order graph analysis (i.e., community detection in multiplex and multilayer networks) may help elucidate the interplay of different types of data in contexts where statistical power is heavily limited by small sample sizes, such as rare diseases and precision oncology. The thesis illustrates how network biology, among the several data integration approaches with the potential to achieve this task, can play a pivotal role in addressing this challenge, given its advantages in molecular interpretability. 
Through its insights and methodologies, it shows how network biology, and in particular models based on multilayer networks, helps bring the vision of precision medicine to these complex scenarios, providing a natural approach for discovering new biomedical relationships that overcomes the difficulties of studying cohorts with limited sample sizes (data-scarce scenarios). Examining the potential of current artificial intelligence (AI) and network biology applications to address data granularity issues in precision medicine, this PhD thesis presents research works, based on multilayer networks, that analyze two rare disease scenarios with specific data granularities, overcoming the classical constraints that hinder rare disease and precision oncology research. The first research article presents a personalized medicine study of the molecular determinants of severity in congenital myasthenic syndromes (CMS), a group of rare disorders of the neuromuscular junction (NMJ). The analysis of severity in rare diseases, despite its importance, is typically neglected because of limited data availability. In this study, modeling biomedical knowledge via multilayer networks made it possible to understand the functional implications of individual mutations in the cohort under study, as well as their relationships with the causal mutations of the disease and the different levels of severity observed. Moreover, the study presents experimental evidence of the role of a previously unsuspected gene in NMJ activity, validating the hypothetical role predicted using the newly introduced methodologies. The second research article focuses on the applicability of multilayer networks to gene prioritization. 
Extending concepts for the analysis of different data granularities first introduced in the previous article, this research provides a methodology based on the persistence of network community structures across a range of modularity resolutions, yielding a new framework for gene prioritization and patient stratification. In summary, this PhD thesis presents major advances in the use of multilayer network-based approaches for applying precision medicine to data-scarce scenarios, exploring the potential of integrating extensive available biomedical knowledge with patient-specific data.

    Unraveling the transcriptional Cis-regulatory code

    It is now accepted that eukaryotic complexity is not dictated by the number of protein-coding genes in the genome, but rather achieved through the combinatorics of gene expression programs. Distinct aspects of the expression pattern of a gene are mediated by discrete regulatory sequences, known as cis-regulatory elements. The work described in this thesis was aimed at developing computational and statistical methods to guide the search for and characterization of novel cis-regulatory elements.

    Exploring missing heritability in neurodevelopmental disorders: Learning from regulatory elements

    In this thesis, I aimed to resolve part of the missing heritability in neurodevelopmental disorders using computational approaches. Alongside investigations of a novel epilepsy syndrome and of the regulation of the gene involved, I investigated and prioritized genomic sequences implicated in gene regulation during the developmental stages of the human brain. The goal was to create an atlas of high-confidence non-coding regulatory elements that future studies can assess for genetic variants in genetically unexplained individuals with neurodevelopmental disorders of suspected genetic origin.

    RESPONSE AND MOLECULAR CONTROL OF CD8 T CELLS DURING INFECTION AND CANCER

    CD8 T cells are potent immune effector cells capable of vast clonal expansion and clearance of infected or cancerous cells. After control of the pathogenic insult, CD8 T cells develop into quiescent, long-lived memory populations that are poised to mediate rapid protection upon reencounter with cognate antigen. These properties make control of CD8 T cell responses a highly desirable outcome of vaccine strategies and immunotherapy. Therefore, understanding how the effector function and memory differentiation of CD8 T cells are controlled at a molecular level is of great importance. In the context of infection with gammaherpesviruses (γHV), which form a latent infection that persists for the life span of the host, CD8 T cells play a vital role in the control of γHV-associated lymphomagenesis. The following studies utilize murine gammaherpesvirus (MHV)-68 and a novel model of γHV-associated B cell lymphoma, EM61, to dissect the mechanisms of CD8 T cell-mediated control of γHV-associated lymphomagenesis. These studies indicate that γHV-specific CD8 T cells control EM61 through mechanisms that partially overlap with those used to control viral replication; however, we also note important differences. We additionally describe γHV-specific, tissue-resident memory CD8 T cells (TRM) that form after infection with MHV-68. In the absence of CD4 T cell help, which causes reactivation of γHV during latency, the γHV-specific TRM compartment exhibits changes that are distinct from those observed in the context of acute viral infection. Additional work focused on the molecular control of CD8 T cells by the BTB-ZF family transcription factor (TF) Zbtb20, which restricts CD8 T cell memory differentiation. Using single-cell techniques, we identify programs of transcriptional and epigenetic regulation associated with memory CD8 T cell differentiation that underlie enhanced memory cell formation in the absence of Zbtb20. 
Furthermore, using a sensitive technique to interrogate Zbtb20-DNA binding, we describe DNA motifs and genomic annotations from the direct genomic targets of Zbtb20 in CD8 T cells. Together, this work provides new knowledge relevant to the response and control of CD8 T cells during infection and cancer.

    Studies on genetic and epigenetic regulation of gene expression dynamics

    The information required to build an organism is contained in its genome, and the first biochemical process that activates the genetic information stored in DNA is transcription. Cell type-specific gene expression shapes cellular functional diversity, and dysregulation of transcription is a central feature of human disease. Therefore, understanding transcriptional regulation is central to understanding biology in health and disease. Transcription is a dynamic process, occurring in discrete bursts of activity that can be characterized by two kinetic parameters: burst frequency, describing how often genes burst, and burst size, describing how many transcripts are generated in each burst. Genes are under strict regulatory control by distinct sequences in the genome as well as epigenetic modifications. To properly study how genetic and epigenetic factors affect transcription, it needs to be treated as the dynamic cellular process it is. In this thesis, I present the development of methods that allow identification of newly induced gene expression over short timescales, as well as inference of kinetic parameters describing how frequently genes burst and how many transcripts each burst gives rise to. The work is presented through four papers. In Paper I, I describe the development of a novel method for profiling newly transcribed RNA molecules. We use this method to show that therapeutic compounds affecting different epigenetic enzymes elicit distinct, compound-specific responses mediated by different sets of transcription factors as early as one hour after treatment, responses that can only be detected when measuring newly transcribed RNA. The goal of Paper II is to determine how genetic variation shapes transcriptional bursting. To this end, we infer transcriptome-wide burst kinetics parameters from genetically distinct donors and find variation that selectively affects burst sizes and frequencies. 
Paper III describes a method for inferring transcriptional kinetics transcriptome-wide using single-cell RNA sequencing. We use this method to describe how the regulation of transcriptional bursting is encoded in the genome. Our findings show that gene-specific burst sizes depend on core promoter architecture and that enhancers affect burst frequencies. Furthermore, cell type-specific differential gene expression is regulated by cell type-specific burst frequencies. Lastly, Paper IV shows how transcription shapes cell types. We collect data on cellular morphologies and electrophysiological characteristics, and measure gene expression, in the same neurons collected from the mouse motor cortex. Our findings show that cells belonging to the same transcriptomic family have distinct and non-overlapping morpho-electric characteristics compared with other families. Within families, there is continuous and correlated variation in all modalities, challenging the notion of cell types as discrete entities.
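
The two burst parameters named in this abstract (burst frequency and burst size) can be illustrated with a small sketch. It is not the thesis's inference method, which the abstract does not specify. It assumes the commonly used bursty limit of the two-state transcription model, in which steady-state transcript counts follow a gamma-Poisson (negative binomial) distribution with mean equal to burst frequency times burst size and Fano factor equal to one plus burst size, with burst frequency expressed in units of the mRNA degradation rate. The function names and the moment-based estimator are illustrative choices.

```python
import math
import random
from statistics import fmean, variance


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_counts(burst_freq, burst_size, n_cells, seed=0):
    """Simulate per-cell transcript counts for one gene.

    Assumption: counts are gamma-Poisson distributed, with gamma shape
    equal to burst_freq and gamma scale equal to burst_size.
    """
    rng = random.Random(seed)
    return [
        sample_poisson(rng, rng.gammavariate(burst_freq, burst_size))
        for _ in range(n_cells)
    ]


def estimate_burst_kinetics(counts):
    """Method-of-moments estimate of (burst_freq, burst_size).

    Uses Fano factor = 1 + burst_size and mean = burst_freq * burst_size.
    """
    mean = fmean(counts)
    fano = variance(counts) / mean
    if fano <= 1.0:
        raise ValueError("counts are not over-dispersed; no bursting signal")
    burst_size = fano - 1.0
    return mean / burst_size, burst_size
```

Calling `estimate_burst_kinetics(simulate_counts(2.0, 10.0, 5000))` should return values near the simulated parameters. Real single-cell data add technical noise and dropout, so moment estimates like these are only a rough first look.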