394 research outputs found

    Network-based methods for biological data integration in precision medicine

    The vast and continuously increasing volume of biomedical data produced during the last decades opens new opportunities for large-scale modelling of disease biology, facilitating a more comprehensive and integrative understanding of its processes. Nevertheless, this type of modelling requires highly efficient computational systems capable of dealing with such data volumes. Computational approaches commonly used in machine learning and data analysis, namely dimensionality reduction and network-based methods, have been developed with the goal of effectively integrating biomedical data. Among these methods, network-based machine learning stands out due to its major advantage in terms of biomedical interpretability. These methodologies provide a highly intuitive framework for the integration and modelling of biological processes. This PhD thesis aims to explore the potential of integrating complementary available biomedical knowledge with patient-specific data to provide novel computational approaches to solve biomedical scenarios characterized by data scarcity. The primary focus is on studying how high-order graph analysis (i.e., community detection in multiplex and multilayer networks) may help elucidate the interplay of different types of data in contexts where statistical power is heavily impacted by small sample sizes, such as rare diseases and precision oncology. The central focus of this thesis is to illustrate how network biology, among the several data integration approaches with the potential to achieve this task, can play a pivotal role in addressing this challenge given its advantages in molecular interpretability. 
Through its insights and methodologies, it shows how network biology, and in particular models based on multilayer networks, helps bring the vision of precision medicine to these complex scenarios, providing a natural approach for the discovery of new biomedical relationships that overcomes the difficulties of studying cohorts with limited sample sizes (data-scarce scenarios). Delving into the potential of current artificial intelligence (AI) and network biology applications to address data granularity issues in the precision medicine field, this PhD thesis presents pivotal research works, based on multilayer networks, for the analysis of two rare disease scenarios with specific data granularities, effectively overcoming the classical constraints hindering rare disease and precision oncology research. The first research article presents a personalized medicine study of the molecular determinants of severity in congenital myasthenic syndromes (CMS), a group of rare disorders of the neuromuscular junction (NMJ). The analysis of severity in rare diseases, despite its importance, is typically neglected due to limited data availability. In this study, modelling biomedical knowledge via multilayer networks made it possible to understand the functional implications of individual mutations in the cohort under study, as well as their relationships with the causal mutations of the disease and the different levels of severity observed. Moreover, the study presents experimental evidence of the role of a previously unsuspected gene in NMJ activity, validating the hypothetical role predicted using the newly introduced methodologies. The second research article focuses on the applicability of multilayer networks for gene prioritization. 
Building on concepts for the analysis of different data granularities first introduced in the previous article, this research provides a methodology based on the persistence of network community structures across a range of modularity resolutions, effectively providing a new framework for gene prioritization for patient stratification. In summary, this PhD thesis presents major advances in the use of multilayer network-based approaches for the application of precision medicine to data-scarce scenarios, exploring the potential of integrating extensive available biomedical knowledge with patient-specific data.

    Quantum photonics at telecom wavelengths based on lithium niobate waveguides

    Integrated optical components on lithium niobate play a major role in standard high-speed communication systems. Over the last two decades, after the birth and positioning of quantum information science, lithium niobate waveguide architectures have emerged as one of the key platforms for enabling photonic quantum technologies. Due to mature technological processes for waveguide structure integration, as well as inherent and efficient properties for nonlinear optical effects, lithium niobate devices are nowadays at the heart of many photon-pair or triplet sources, single-photon detectors, coherent wavelength-conversion interfaces, and quantum memories. Consequently, they find applications in advanced and complex quantum communication systems, where compactness, stability, efficiency, and interconnectability with other guided-wave technologies are required. In this review paper, we first introduce the material aspects of lithium niobate, and subsequently discuss all of the above-mentioned quantum components, ranging from standard photon-pair sources to more complex and advanced circuits.

    Topological data analysis of organoids

    Organoids are multi-cellular structures which are cultured in vitro from stem cells to resemble specific organs (e.g., colon, liver) in their three-dimensional composition. The gene expression and the tissue composition of organoids constantly affect each other. Dynamic changes in the shape, cellular composition and transcriptomic profile of these model systems can be used to understand the effect of mutations and treatments in health and disease. In this thesis, I propose new techniques in the field of topological data analysis (TDA) to analyse the gene expression and the morphology of organoids. I use TDA methods, which are inspired by topology, to analyse and quantify the continuous structure of single-cell RNA sequencing data, which is embedded in high-dimensional space, and the shape of an organoid. For single-cell RNA sequencing data, I developed the multiscale Laplacian score (MLS) and the UMAP diffusion cover, which both extend and improve existing topological analysis methods. I demonstrate the utility of these techniques by applying them to a published benchmark single-cell data set and a data set of mouse colon organoids. The methods validate previously identified genes and detect additional genes with known involvement in cancers. To study the morphology of organoids I propose DETECT, a rotationally invariant signature of dynamically changing shapes. I demonstrate the efficacy of this method on a data set of segmented videos of mouse small intestine organoid experiments and show that it outperforms classical shape descriptors. I verify the method on a synthetic organoid data set and illustrate how it generalises to 3D to conclude that DETECT offers rigorous quantification of organoids and opens up computationally scalable methods for distinguishing different growth regimes and assessing treatment effects. Finally, I make a theoretical contribution to the statistical inference of the method underlying DETECT.

    Pathogenesis of mitochondrial dysfunction in skeletal muscle

    PhD Thesis. Mitochondrial diseases are amongst the most prevalent genetic disorders; however, little is known about the genetic and cellular mechanisms behind disease pathogenesis and progression. Elucidating such mechanisms can help identify targets for novel therapeutic measures and improve patient care by informing the implementation of clinical regimens and providing clearer information on prognoses. This project aims to improve the understanding of the molecular mechanisms behind the pathogenesis of mitochondrial dysfunction in muscle and the genetic and biochemical changes occurring over time in patients with mitochondrial disease. Firstly, a longitudinal study combines immunofluorescent and molecular genetic techniques to assess biochemical and genetic changes over time in serial skeletal muscle biopsies from patients with m.3243A>G or single, large-scale mtDNA deletions, the two largest groups in the MRC Centre Mitochondrial Disease Patient Cohort. Further investigation into the relationship between the genetic and biochemical defects in patients with single, large-scale mtDNA deletions is carried out by applying a single-fibre approach. Here, muscle fibres are classified by their biochemical defect and laser microdissected for genetic analysis to determine deletion level and mtDNA copy number. These studies find that: (i) changes to mutation level, mtDNA copy number and biochemical defect occur over time in skeletal muscle of mitochondrial disease patients; (ii) these changes are inconsistent in magnitude and direction across groups of patients; and (iii) the biochemical threshold for deficiency is affected by the size and location of single, large-scale mtDNA deletions. In addition, a real-time PCR assay for the quantification of mitochondrial DNA copy number from homogenate tissue has been optimised to improve accuracy through the use of additional gene markers. The Barbour Foundation

    Reconstructing networks

    Complex network datasets often come with the problem of missing information: interaction data may not have been measured or discovered, may be affected by errors, or may simply be hidden because of privacy issues. This Element provides an overview of the ideas, methods and techniques that deal with this problem and that together define the field of network reconstruction. Given the extent of the subject, we shall focus on the inference methods rooted in statistical physics and information theory. The discussion will be organized according to the different scales of the reconstruction task, that is, whether the goal is to reconstruct the macroscopic structure of the network, to infer its mesoscale properties, or to predict the individual microscopic connections. Comment: 107 pages, 25 figures

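    Tselluloosi ensümaatilise hüdrolüüsi mehhanismi uurimine madalmolekulaarsete mudelsubstraatide abil [Investigating the mechanism of enzymatic cellulose hydrolysis using low-molecular-weight model substrates]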

    The electronic version of the dissertation does not include the publications. Cellulose is a water-insoluble biopolymer consisting of linear chains in which glucose residues are linked by glycosidic bonds. In nature, cellulose is an important energy and carbon source for many bacterial and fungal species. To degrade cellulose, these organisms use a set of hydrolytic and oxidative enzymes, collectively called the cellulolytic system. The most important components of cellulolytic systems are cellobiohydrolases. These enzymes are processive: once bound to a cellulose chain, they hydrolyse glycosidic bonds one after another while moving along the chain, without dissociating in between. The soft-rot fungus Trichoderma reesei is one of the best-studied cellulose-degrading organisms. Although Trichoderma reesei cellulases have been studied for decades, cellulose hydrolysis is still not fully understood. For cellulases it has not been possible to determine unambiguously the catalytic constant kcat, a kinetic parameter widely used in classical enzymology. The difficulty stems on the one hand from the substrate: cellulose is insoluble in water and therefore is not uniformly distributed in space. On the other hand it stems from the enzyme: cellulases can bind to cellulose both productively and non-productively. It is also known that cellulose hydrolysis by cellobiohydrolases does not follow classical Michaelis-Menten kinetics; instead, the hydrolysis rate drops sharply over time. In this doctoral thesis, a method was developed for determining the catalytic constant of enzymatic cellulose hydrolysis. The method is based on the hydrolysis of a model substrate and makes it possible to distinguish productively bound enzyme from non-productively bound enzyme. The method also makes it possible to follow the fraction of productively bound enzyme as hydrolysis proceeds and to study how the apparent kinetic parameters change over time. According to the proposed hypothesis, the decline in the rate of cellobiohydrolases is caused by enzyme molecules getting "stuck" on the cellulose surface. Processive cellobiohydrolases move along the cellulose chain until they meet an obstacle and their movement stops. Because the dissociation rate of cellobiohydrolases is low, the enzyme molecules remain held up behind the obstacle and the fraction of non-productively bound enzyme increases. The activity of cellobiohydrolases can be increased by promoting the dissociation of non-productively bound enzymes. The results provide a basis for producing better enzyme mixtures for the industrial hydrolysis of cellulose.
Cellulose is a water-insoluble polysaccharide that consists of linear chains of glucose residues. Cellulose is an important energy and carbon source for many species of fungi and bacteria. These organisms secrete a set of hydrolytic and oxidative enzymes, collectively called the cellulolytic system. The major components of these cellulolytic systems are cellobiohydrolases. These enzymes are processive, which means that the enzyme, once bound productively to the substrate, performs several consecutive catalytic steps on a single polysaccharide chain before it dissociates. The best-described cellulolytic system is that of the soft-rot fungus Trichoderma reesei. While Trichoderma reesei cellulases have been the subject of intensive study for decades, the mechanism of cellulase-catalyzed cellulose hydrolysis is still not fully understood. One of the biggest shortcomings is the difficulty of measuring the rate constant of cellulases acting on cellulose. Problems arise from the heterogeneous insoluble substrate, which renders the reaction non-uniform, as well as from the modular structure of the enzyme, which enables both productive and non-productive binding of the enzyme. Also, it is well known that the rate of enzymatic cellulose hydrolysis drops rapidly in time. The initial burst of activity is followed by a rapid decrease in the hydrolysis rate. This work introduces novel methods for determining the rate constants of cellobiohydrolase-catalyzed cellulose hydrolysis. These methods are based on the hydrolysis of small soluble model substrates and enable the determination of productively bound enzyme. The methods were used to investigate the mechanism behind the decrease in activity of cellulases during cellulose hydrolysis. Our hypothesis is that the decline of the cellulose hydrolysis rate is caused by obstacles on the path of the processive movement of the enzyme. According to this model, the newly formed productive enzyme-substrate complex moves along the cellulose chain at a constant rate. Once the complex encounters an obstacle, it is stalled. Since the dissociation rate constant is low, the enzyme remains "stuck" on the substrate and the proportion of non-productively bound enzyme increases. The action of cellobiohydrolases could be enhanced by promoting recruitment of the enzyme. The results provide a basis for producing better enzyme cocktails for industrial degradation of cellulose.

    Juhukõnnid translatsioonis [Random walks in translation]

    The electronic version of the dissertation does not include the publications. The stringent response is of key importance in regulating the adaptive mechanisms that help bacteria survive unfavourable environmental conditions. In Escherichia coli, an important enzyme in this process is RelA, which synthesizes the signalling molecule (p)ppGpp in response to amino acid starvation. This signalling molecule affects transcription, translation and cell division. We developed a single-molecule tracking microscopy methodology that makes it possible to measure the diffusion of molecules in the cell. We used this methodology to characterize molecules moving at different speeds. The fluorescent protein mEos2 served as an example of a protein that diffuses freely in the cell. The mitochondrial membrane protein Tom40, whose movement is confined to one location, turned out to have quite different properties. For RelA we observed both free, rapidly diffusing molecules and ribosome-bound, and therefore slowly moving, molecules. Combining the single-molecule tracking results with biochemical data, we propose a model of the working cycle of the RelA protein. Accumulating (p)ppGpp also causes RelA activation. In this way a regulatory system with positive feedback arises, and the concentration of the signalling molecule rises quickly.
The stringent response plays a key role in the activation and regulation of the adaptive mechanisms that bacteria employ in order to accommodate to adverse conditions. In E. coli the process is governed by the stringent factor RelA, which transfers amino acid starvation signals by synthesizing (p)ppGpp, altering cell replication, transcription and translation. We have developed an in vivo single-molecule tracking microscopy assay that allows us to track fast and slowly diffusing cytosolic proteins (the stringent factor RelA and the free GFP variant mEos2) or membrane-bound proteins (the mitochondrial membrane channel Tom40). The fluorescently labeled Tom40-Dendra2 complex in the mitochondrial membrane showed highly mobile but confined diffusion properties. By combining biochemical and single-molecule microscopy approaches, we have suggested a (p)ppGpp-synthesizing mechanism, different from the standard hopping model, in which many (p)ppGpp molecules are produced upon dissociation of enzymatically active RelA from the ribosome and (p)ppGpp production is directly responsible for enhancement of the RelA enzymatic activity by a positive feedback loop acting at the enzymatic level.

    Computational Labeling, Partitioning, and Balancing of Molecular Networks

    Recent advances in high-throughput techniques enable large-scale quantification of molecules, including mRNAs, proteins and metabolites, with high accuracy. Differential expression of these molecules in case and control samples provides a way to select phenotype-associated molecules with statistically significant changes. However, given a list of molecular changes ranked by significance, how those molecules work together to drive phenotype formation is still unclear. In particular, the changes in molecular quantities are insufficient to interpret the changes in their functional behavior. My study is aimed at answering this question by integrating molecular network data to systematically model and estimate the changes of molecular functional behaviors. We build three computational models to label, partition, and balance molecular networks using modern machine learning techniques. (1) Due to the incompleteness of protein functional annotation, we develop AptRank, an adaptive PageRank model for protein function prediction on bilayer networks. By integrating the Gene Ontology (GO) hierarchy with a protein-protein interaction network, AptRank outperforms four state-of-the-art methods in a comprehensive evaluation using benchmark datasets. (2) We next extend AptRank into a network partitioning method, BioSweeper, to identify functional network modules in which molecules share similar functions and are also densely connected to each other. Compared to traditional network partitioning methods using only network connections, BioSweeper, which integrates the GO hierarchy, can automatically identify functionally enriched network modules. (3) Finally, we conduct a differential interaction analysis, namely difFBA, on protein-protein interaction networks by simulating protein fluxes using flux balance analysis (FBA). 
We test difFBA using quantitative proteomic data from colon cancer, and demonstrate that difFBA offers more insights into functional changes in molecular behavior than do protein quantity changes alone. We conclude that our integrative network model increases the observational dimensions of complex biological systems, and enables us to more deeply understand the causal relationships between genotypes and phenotypes.