193 research outputs found

    From AI-driven synthetic biology to prediction of molecular phenotypes

    Transcriptional regulation and steady-state modeling of metabolic networks

    Biological systems are characterized by a high degree of complexity, wherein the individual components (e.g. proteins) are interlinked in a way that leads to emergent behaviors that are difficult to decipher. Uncovering system complexity requires, at the least, answers to the following three questions: what are the components of the systems, how are the different components interconnected, and how do these networks perform the functions that produce the resulting system behavior? Modern analytical technologies allow us to unravel the constituents and interactions in a given system; however, the third question is the ultimate challenge for systems biology. This thesis systematically addresses that question in the context of metabolic networks, which are arguably the best-characterized cellular networks in terms of their constituent components and the interactions among them.
Furthermore, there is great interest in understanding and manipulating cellular metabolism from both health and biotechnological perspectives. Fundamentally different biological questions are investigated in the core chapters of the thesis, though all are linked by the common thread of how cellular metabolism functions. The three main topics addressed are: i) transcriptional regulation of metabolite concentrations, ii) transcriptional dysregulation of skeletal muscle metabolism in type 2 diabetes, and iii) metabolic interactions in microbial ecosystems. The overall objective is to obtain a novel understanding of the operating principles of metabolic networks.
Cellular responses to environmental perturbations and genetic/epigenetic modifications are to a large extent controlled through transcription, which is one of the fundamental mechanisms of cellular regulation. An important question is to what extent gene expression can explain metabolic phenotype; in other words, how well can changes in metabolite concentrations be explained by changes in the related enzyme-coding transcripts? Attempts to predict changes in the metabolome from gene expression data have so far been unsuccessful. Here, I address this question by proposing a mechanistic explanation of the interplay between metabolite concentrations, transcripts and fluxes based on Michaelis-Menten kinetics at the network scale. The work demonstrates that in steady-state systems, changes in intracellular metabolite concentrations are linked with changes in gene expression of both the reactions that produce and the reactions that consume a given metabolite. Analysis of a large compendium of gene expression data further suggested that, contrary to previous thinking, transcriptional regulation at metabolic branch points is highly plastic and, in several cases, the objective of the regulation appears to be metabolite-oriented as opposed to pathway-oriented. The study thus provides a fundamental and novel view of metabolic network regulation in Saccharomyces cerevisiae.
Metabolism is a conserved system across all domains of life, and it has become a focal point in diagnosing and treating diseases such as diabetes and cancer. Type 2 diabetes mellitus is a complex metabolic disease that is recognized as one of the largest threats to human health in the 21st century. Recent studies of gene expression levels in human tissue samples have indicated that multiple metabolic pathways are dysregulated in diabetes and in individuals at risk for diabetes; which of these are primary, or central to disease pathogenesis, remains a key question. Cellular metabolic networks are highly interconnected and often tightly regulated; a perturbation at a single node can thus rapidly propagate to the rest of the network. Such complexity presents a considerable challenge in pinpointing the key molecular mechanisms and signatures associated with insulin resistance and type 2 diabetes. The present work addresses this problem with a methodology that integrates gene expression data with the human cellular metabolic network. The approach is demonstrated by analysis of two skeletal muscle gene expression datasets. The proposed methodology identified transcription factors and metabolites that represent potential targets for therapeutic agents and future clinical diagnostics for type 2 diabetes and impaired glucose metabolism. In a broader context, the study provides a framework for analyzing gene expression datasets from complex heterogeneous diseases and from genetic and environmental perturbations that are reflected in and/or mediated through changes in metabolism.
In nature, microorganisms do not usually exist as pure cultures, but evolve and coexist with other species. Microbial communities have a variety of potential applications, including metabolic disease therapies and biotechnology. For example, microbial consortia consisting of various bacteria and fungi are known to exhibit biodegradation performance superior to that of pure cultures, making them attractive research targets. It is widely believed that nutrition plays a crucial role in shaping microbial communities, and interspecies metabolite cross-feeding can confer several advantages on the community as a whole. For example, more efficient and complete use of available nutrients, or an increased ability to survive under diverse or changing nutrient availability, can potentially increase the fitness of individuals. The third topic of this thesis investigates the role of metabolic interactions in co-occurring microbial communities. The study aims to identify the metabolic properties that shape community structures. The analysis, based on a global metagenomic dataset and genome-scale metabolic models, suggested that species within coexisting communities have a higher potential for metabolic cooperation than random controls. This work yielded a novel methodology (termed species metabolic coupling analysis) for studying metabolic interactions and interdependencies within microbial communities. Species metabolic coupling analysis has a spectrum of applications to real-world problems, including investigation of metabolic interactions within the human microbiome, host-pathogen interactions and the development of stable microbial communities. Overall, this work contributes novel insights, tools and methodologies for studying the operation of cellular metabolism.

    Parallel Factor Analysis Enables Quantification and Identification of Highly Convolved Data-Independent-Acquired Protein Spectra

    The latest high-throughput mass spectrometry-based technologies can record virtually all molecules from complex biological samples, providing a holistic picture of proteomes in cells and tissues and enabling an evaluation of the overall status of a person's health. However, current best practices are still only scratching the surface of the wealth of available information obtained from the massive proteome datasets, and efficient novel data-driven strategies are needed. Powered by advances in GPU hardware and open-source machine-learning frameworks, we developed a data-driven approach, CANDIA, which disassembles highly complex proteomics data into the elementary molecular signatures of the proteins in biological samples. Our work provides a performant and adaptable solution that complements existing mass spectrometry techniques. As the central mathematical methods are generic, other scientific fields that are dealing with highly convolved datasets will benefit from this work.

    Toward learning the principles of plant gene regulation

    Advanced machine learning (ML) algorithms produce highly accurate models of gene expression, uncovering novel regulatory features in nucleotide sequences involving multiple cis-regulatory regions across whole genes and structural properties. These broaden our understanding of gene regulation and point to new principles to test and adopt in the field of plant science

    Contribution of Network Connectivity in Determining the Relationship between Gene Expression and Metabolite Concentration Changes

    One of the primary mechanisms through which a cell exerts control over its metabolic state is by modulating expression levels of its enzyme-coding genes. However, the changes at the level of enzyme expression allow only indirect control over metabolite levels, for two main reasons. First, at the level of individual reactions, metabolite levels are non-linearly dependent on enzyme abundances as per the reaction kinetics mechanisms. Secondly, specific metabolite pools are tightly interlinked with the rest of the metabolic network through their production and consumption reactions. While the role of reaction kinetics in metabolite concentration control is well studied at the level of individual reactions, the contribution of network connectivity has remained relatively unclear. Here we report a modeling framework that integrates both reaction kinetics and network connectivity constraints for describing the interplay between metabolite concentrations and mRNA levels. We used this framework to investigate correlations between the gene expression and the metabolite concentration changes in Saccharomyces cerevisiae during its metabolic cycle, as well as in response to three fundamentally different biological perturbations, namely gene knockout, nutrient shock and nutrient change. While the kinetic constraints applied at the level of individual reactions were found to be poor descriptors of the mRNA-metabolite relationship, their use in the context of the network enabled us to correlate changes in the expression of enzyme-coding genes to the alterations in metabolite levels. Our results highlight the key contribution of metabolic network connectivity in mediating cellular control over metabolite levels, and have implications towards bridging the gap between genotype and metabolic phenotype
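
    The two points made in this abstract (metabolite levels depend non-linearly on enzyme abundance, and they depend on both producing and consuming reactions) can be illustrated with a deliberately tiny sketch. This is not the modeling framework reported in the paper: it assumes a single metabolite with one substrate-saturated producing reaction and one Michaelis-Menten consuming reaction, and every function name and parameter value below is hypothetical.

```python
# Toy steady-state model of one metabolite M with one producing and one
# consuming reaction. All names and parameter values are illustrative
# assumptions; this is not the framework described in the abstract.

def consumption_rate(m, e_out, kcat_out=1.0, km=0.5):
    """Michaelis-Menten rate of the consuming reaction at metabolite level m."""
    return kcat_out * e_out * m / (km + m)


def steady_state_level(e_in, e_out, kcat_in=1.0, kcat_out=1.0, km=0.5):
    """Metabolite level at which production equals consumption.

    Production is assumed to be substrate-saturated: v_in = kcat_in * e_in.
    Solving v_in = kcat_out * e_out * m / (km + m) for m gives
    m = km * v_in / (vmax_out - v_in), which exists only if vmax_out > v_in.
    """
    v_in = kcat_in * e_in
    vmax_out = kcat_out * e_out
    if vmax_out <= v_in:
        raise ValueError("no steady state: consumption capacity too low")
    return km * v_in / (vmax_out - v_in)


baseline = steady_state_level(e_in=1.0, e_out=2.0)
more_producer = steady_state_level(e_in=1.5, e_out=2.0)  # producer up 50%
more_consumer = steady_state_level(e_in=1.0, e_out=3.0)  # consumer up 50%

print("baseline:", baseline)
print("producing enzyme +50%:", more_producer)
print("consuming enzyme +50%:", more_consumer)
```

    With these assumed values, a 50% increase in the producing enzyme triples the steady-state metabolite level, while the same relative increase in the consuming enzyme halves it. Even in this minimal setting, a change in a single transcript is therefore a poor predictor of the metabolite change unless the reactions on the other side of the pool are also taken into account.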

    metaGEM: reconstruction of genome scale metabolic models directly from metagenomes

    Metagenomic analyses of microbial communities have revealed a large degree of interspecies and intraspecies genetic diversity through the reconstruction of metagenome assembled genomes (MAGs). Yet, metabolic modeling efforts mainly rely on reference genomes as the starting point for reconstruction and simulation of genome scale metabolic models (GEMs), neglecting the immense intra- and inter-species diversity present in microbial communities. Here, we present metaGEM (https://github.com/franciscozorrilla/metaGEM), an end-to-end pipeline enabling metabolic modeling of multi-species communities directly from metagenomes. The pipeline automates all steps from the extraction of context-specific prokaryotic GEMs from MAGs to community level flux balance analysis (FBA) simulations. To demonstrate the capabilities of metaGEM, we analyzed 483 samples spanning lab culture, human gut, plant-associated, soil, and ocean metagenomes, reconstructing over 14,000 GEMs. We show that GEMs reconstructed from metagenomes have fully represented metabolism comparable to isolated genomes. We demonstrate that metagenomic GEMs capture intraspecies metabolic diversity and identify potential differences in the progression of type 2 diabetes at the level of gut bacterial metabolic exchanges. Overall, metaGEM enables FBA-ready metabolic model reconstruction directly from metagenomes, provides a resource of metabolic models, and showcases community-level modeling of microbiomes associated with disease conditions allowing generation of mechanistic hypotheses.

    Plastic-Degrading Potential across the Global Microbiome Correlates with Recent Pollution Trends

    Biodegradation is a plausible route toward sustainable management of the millions of tons of plastic waste that have accumulated in terrestrial and marine environments. However, the global diversity of plastic-degrading enzymes remains poorly understood. Taking advantage of global environmental DNA sampling projects, here we constructed hidden Markov models from experimentally verified enzymes and mined ocean and soil metagenomes to assess the global potential of microorganisms to degrade plastics. By controlling for false positives using gut microbiome data, we compiled a catalogue of over 30,000 nonredundant enzyme homologues with the potential to degrade 10 different plastic types. While differences between the ocean and soil microbiomes likely reflect the base compositions of these environments, we find that ocean enzyme abundance increases with depth as a response to plastic pollution and not merely taxonomic composition. By obtaining further pollution measurements, we observed that the abundance of the uncovered enzymes in both ocean and soil habitats significantly correlates with marine and country-specific plastic pollution trends. Our study thus uncovers the earth microbiome's potential to degrade plastics, providing evidence of a measurable effect of plastic pollution on the global microbial ecology as well as a useful resource for further applied research. IMPORTANCE Utilization of synthetic biology approaches to enhance current plastic degradation processes is of crucial importance, as natural plastic degradation processes are very slow. For instance, the predicted lifetime of a polyethylene terephthalate (PET) bottle under ambient conditions ranges from 16 to 48 years. Moreover, although there is still unexplored diversity in microbial communities, synergistic degradation of plastics by microorganisms holds great potential to revolutionize the management of global plastic waste. 
To this end, the methods and data on novel plastic-degrading enzymes presented here can help researchers by (i) providing further information about the taxonomic diversity of such enzymes as well as understanding of the mechanisms and steps involved in the biological breakdown of plastics, (ii) pointing toward the areas with increased availability of novel enzymes, and (iii) giving a basis for further application in industrial plastic waste biodegradation. Importantly, our findings provide evidence of a measurable effect of plastic pollution on the global microbial ecology

    Learning the Regulatory Code of Gene Expression

    Data-driven machine learning is the method of choice for predicting molecular phenotypes from nucleotide sequence, modeling gene expression events including protein-DNA binding, chromatin states as well as mRNA and protein levels. Deep neural networks automatically learn informative sequence representations and interpreting them enables us to improve our understanding of the regulatory code governing gene expression. Here, we review the latest developments that apply shallow or deep learning to quantify molecular phenotypes and decode the cis-regulatory grammar from prokaryotic and eukaryotic sequencing data. Our approach is to build from the ground up, first focusing on the initiating protein-DNA interactions, then specific coding and non-coding regions, and finally on advances that combine multiple parts of the gene and mRNA regulatory structures, achieving unprecedented performance. We thus provide a quantitative view of gene expression regulation from nucleotide sequence, concluding with an information-centric overview of the central dogma of molecular biology
