    An expectation-maximization framework for comprehensive prediction of isoform-specific functions.

    MOTIVATION: Advances in RNA sequencing technologies have achieved an unprecedented accuracy in the quantification of mRNA isoforms, but our knowledge of isoform-specific functions has lagged behind. There is a need to understand the functional consequences of differential splicing, which could be supported by the generation of accurate and comprehensive isoform-specific gene ontology annotations. RESULTS: We present isoform interpretation, a method that uses expectation-maximization to infer isoform-specific functions based on the relationship between sequence and functional isoform similarity. We predicted isoform-specific functional annotations for 85 617 isoforms of 17 900 protein-coding human genes spanning a range of 17 430 distinct gene ontology terms. Comparison with a gold-standard corpus of manually annotated human isoform functions showed that isoform interpretation significantly outperforms state-of-the-art competing methods. We provide experimental evidence that functionally related isoforms predicted by isoform interpretation show a higher degree of domain sharing and expression correlation than functionally related genes. We also show that isoform sequence similarity correlates better with inferred isoform function than with gene-level function. AVAILABILITY AND IMPLEMENTATION: Source code, documentation, and resource files are freely available under a GNU3 license at https://github.com/TheJacksonLaboratory/isopretEM and https://zenodo.org/record/7594321

    Computational Labeling, Partitioning, and Balancing of Molecular Networks

    Recent advances in high throughput techniques enable large-scale molecular quantification with high accuracy, including mRNAs, proteins and metabolites. Differential expression of these molecules in case and control samples provides a way to select phenotype-associated molecules with statistically significant changes. However, given the significance ranking list of molecular changes, how those molecules work together to drive phenotype formation is still unclear. In particular, the changes in molecular quantities are insufficient to interpret the changes in their functional behavior. My study is aimed at answering this question by integrating molecular network data to systematically model and estimate the changes of molecular functional behaviors. We build three computational models to label, partition, and balance molecular networks using modern machine learning techniques. (1) Due to the incompleteness of protein functional annotation, we develop AptRank, an adaptive PageRank model for protein function prediction on bilayer networks. By integrating Gene Ontology (GO) hierarchy with protein-protein interaction network, our AptRank outperforms four state-of-the-art methods in a comprehensive evaluation using benchmark datasets. (2) We next extend our AptRank into a network partitioning method, BioSweeper, to identify functional network modules in which molecules share similar functions and also densely connect to each other. Compared to traditional network partitioning methods using only network connections, BioSweeper, which integrates the GO hierarchy, can automatically identify functionally enriched network modules. (3) Finally, we conduct a differential interaction analysis, namely difFBA, on protein-protein interaction networks by simulating protein fluxes using flux balance analysis (FBA). 
We test difFBA using quantitative proteomic data from colon cancer, and demonstrate that difFBA offers more insights into functional changes in molecular behavior than do protein quantity changes alone. We conclude that our integrative network model increases the observational dimensions of complex biological systems, and enables us to more deeply understand the causal relationships between genotypes and phenotypes.

    From condition-specific interactions towards the differential complexome of proteins

    While capturing the transcriptomic state of a cell is a comparatively simple effort with modern sequencing techniques, mapping protein interactomes and complexomes in a sample-specific manner is currently not feasible on a large scale. To understand crucial biological processes, however, knowledge of the physical interplay between proteins can be more interesting than their mere expression. In this thesis, we present and demonstrate four software tools that unlock the cellular wiring in a condition-specific manner and promise a deeper understanding of what happens upon cell fate transitions. PPIXpress exploits the abundance of existing expression data to generate specific interactomes, and can even take alternative splicing events into account when protein isoforms can be related to the presence of causative protein domain interactions of an underlying model. As an addition to this work, we developed the convenient differential analysis tool PPICompare to determine rewiring events and their causes within the inferred interaction networks between grouped samples. Furthermore, we present a new implementation of the combinatorial protein complex prediction algorithm DACO that features a significantly reduced runtime. This improvement makes it possible to apply the method to a large number of samples, and the resulting sample-specific complexes can ultimately be assessed quantitatively with our novel differential protein complex analysis tool CompleXChange.

    First-passage times in complex energy landscapes: a case study with nonmuscle myosin II assembly

    Complex energy landscapes often arise in biological systems, e.g. for protein folding, biochemical reactions or intracellular transport processes. Their physical effects are often reflected in the first-passage times arising from these energy landscapes. However, their calculation is notoriously challenging and it is often difficult to identify the most relevant features of a given energy landscape. Here we show how this can be achieved by coarse-graining the Fokker-Planck equation to a master equation and decomposing its first-passage times in an iterative process. We apply this method to the electrostatic interaction between two rods of nonmuscle myosin II (NM2), which is the main molecular motor for force generation in nonmuscle cells. Energy landscapes are computed directly from the amino acid sequences of the three different isoforms. Our approach allows us to identify their most likely arrangements during self-assembly into nonmuscle myosin II minifilaments and how they change under force. In particular, we find that antiparallel configurations are more stable than parallel ones, but also show more changes under mechanical loading. Our work demonstrates the rich dynamics that can be expected for NM2-assemblies under mechanical load and in general shows how one can identify the most relevant energy barriers in complex energy landscapes.
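
    As a rough illustration of the general idea in this abstract (not the authors' method), the sketch below computes a mean first-passage time on a one-dimensional master equation with nearest-neighbour hops. Everything in it is an assumption made for illustration: the discrete energy list, the symmetric Arrhenius-type rate rule, the reflecting boundary at the first state, the absorbing target at the last state, and the function names. The per-step times hint at which barrier dominates the total.

```python
import math


def hopping_rates(energies, k0=1.0, kT=1.0):
    """Nearest-neighbour rates that satisfy detailed balance (illustrative choice)."""
    n = len(energies)
    # forward[i]: rate for the hop i -> i+1
    forward = [k0 * math.exp(-(energies[i + 1] - energies[i]) / (2 * kT))
               for i in range(n - 1)]
    # backward[i-1]: rate for the hop i -> i-1
    backward = [k0 * math.exp(-(energies[i - 1] - energies[i]) / (2 * kT))
                for i in range(1, n)]
    return forward, backward


def mean_first_passage_time(energies, k0=1.0, kT=1.0):
    """Mean time to first reach the last state, starting from state 0.

    State 0 is reflecting and the last state is absorbing. The backward
    equation f_i (T_{i+1} - T_i) + b_i (T_{i-1} - T_i) = -1 gives a recursion
    for D_i = T_i - T_{i+1}, the mean time to first advance from i to i+1.
    """
    forward, backward = hopping_rates(energies, k0, kT)
    step_times = []
    for i in range(len(energies) - 1):
        if i == 0:
            d = 1.0 / forward[0]  # no backward hop out of the reflecting state
        else:
            d = (1.0 + backward[i - 1] * step_times[i - 1]) / forward[i]
        step_times.append(d)
    return sum(step_times), step_times


if __name__ == "__main__":
    total, steps = mean_first_passage_time([0.0, 2.0, 0.0])
    print("total MFPT:", total)
    print("per-step contributions:", steps)
```

    On a flat landscape the recursion reduces to D_i = (i + 1) / k0, so any deviation of a step time from that baseline reflects the local energy profile.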

    Cell migration and capillary plexus formation in wounds and retinae

    Cell migration is a fundamental biological phenomenon that is critical to the development and maintenance of tissues in multi-cellular organisms. This thesis presents a series of discrete mathematical models designed to study the migratory response of such cells when exposed to a variety of environmental stimuli. By applying these models to pertinent biological scenarios and benchmarking results against experimental data, novel insights are gained into the underlying cell behaviour. The process of angiogenesis is investigated first and models are developed for simulating capillary plexus expansion during both wound healing and retinal vascular development. The simulated cell migration is coupled to a detailed model of blood perfusion that allows prediction of dynamic flow-induced evolution of the nascent vascular architectures – the network topologies generated in each case are found to successfully reproduce a number of longitudinal experimental metrics. Moreover, in the case of retinal development, the resultant distributions of haematocrit and oxygen are found to be essential in generating vasculatures that resemble those observed in vivo. An alternative cell migration model is then derived that is capable of more accurately describing both individual and collective cell movement. The general model framework, which allows for biophysical cell-cell interactions and adaptive cell morphologies, is seen to have the potential for a range of applications. The value of the modelling approach is well demonstrated by benchmarking in silico cell movement against experimental data from an in vitro fibroblast scrape wound assay. The results subsequently reveal an unexplained discrepancy that provides an intriguing challenge for future studies

    Network-based methods for biological data integration in precision medicine

    The vast and continuously increasing volume of available biomedical data produced during the last decades opens new opportunities for large-scale modeling of disease biology, facilitating a more comprehensive and integrative understanding of its processes. Nevertheless, this type of modelling requires highly efficient computational systems capable of dealing with such data volumes. Computational approximations commonly used in machine learning and data analysis, namely dimensionality reduction and network-based approaches, have been developed with the goal of effectively integrating biomedical data. Among these methods, network-based machine learning stands out due to its major advantage in terms of biomedical interpretability. These methodologies provide a highly intuitive framework for the integration and modelling of biological processes. This PhD thesis aims to explore the potential of integrating complementary available biomedical knowledge with patient-specific data to provide novel computational approaches to solve biomedical scenarios characterized by data scarcity. The primary focus is on studying how high-order graph analysis (i.e., community detection in multiplex and multilayer networks) may help elucidate the interplay of different types of data in contexts where statistical power is heavily impacted by small sample sizes, such as rare diseases and precision oncology. The central focus of this thesis is to illustrate how network biology, among the several data integration approaches with the potential to achieve this task, can play a pivotal role in addressing this challenge given its advantages in molecular interpretability. 
Through its insights and methodologies, it introduces how network biology, and in particular models based on multilayer networks, helps bring the vision of precision medicine to these complex scenarios, providing a natural approach for the discovery of new biomedical relationships that overcomes the difficulties of studying cohorts with limited sample sizes (data-scarce scenarios). Delving into the potential of current artificial intelligence (AI) and network biology applications to address data granularity issues in the precision medicine field, this PhD thesis presents pivotal research works, based on multilayer networks, for the analysis of two rare disease scenarios with specific data granularities, effectively overcoming the classical constraints hindering rare disease and precision oncology research. The first research article presents a personalized medicine study of the molecular determinants of severity in congenital myasthenic syndromes (CMS), a group of rare disorders of the neuromuscular junction (NMJ). The analysis of severity in rare diseases, despite its importance, is typically neglected due to limited data availability. In this study, modelling of biomedical knowledge via multilayer networks allowed understanding the functional implications of individual mutations in the cohort under study, as well as their relationships with the causal mutations of the disease and the different levels of severity observed. Moreover, the study presents experimental evidence of the role of a previously unsuspected gene in NMJ activity, validating the hypothetical role predicted using the newly introduced methodologies. The second research article focuses on the applicability of multilayer networks for gene prioritization. 
Enhancing concepts for the analysis of different data granularities first introduced in the previous article, the presented research provides a methodology based on the persistence of network community structures across a range of modularity resolutions, effectively providing a new framework for gene prioritization for patient stratification. In summary, this PhD thesis presents major advances in the use of multilayer network-based approaches for the application of precision medicine to data-scarce scenarios, exploring the potential of integrating extensive available biomedical knowledge with patient-specific data.

    Stochastic transport in complex environments : applications in cell biology

    Living organisms would not be functional without active processes. This general statement is valid down to the cellular level. Transport processes are necessary to create, maintain and support cellular structures. In this thesis, intracellular transport processes, driven by concentration gradients and active matter, as well as the dynamics of migrating cells are studied. Many studies deal with diffusive intracellular transport in the complex environment of neuronal dendrites, but focus on only a few spines. In this thesis, a model was developed for diffusive transport in a full dendritic tree. A link was established between complex structural changes caused by diseases and transport characteristics. Furthermore, recent experimental studies of search processes in the migration of dendritic cells show a link between speed and persistence. In this thesis, a correlation between them was included in a stochastic model, which led to increased search efficiency. Finally, this thesis deals with the question of how active, bidirectional transport by molecular motors in axons can be efficient. Generically, traffic jams are expected in confined environments. Limitations of bypassing mechanisms are discussed with a bidirectional non-Markovian exclusion process, developed in this thesis. Experimental findings of cooperative effects and microtubule modifications have been incorporated in a stochastic model, leading to self-organized lane formation and thus efficient bidirectional transport.

    Energetics of Biological Mechanics and Dynamics

    Living matter is a class of soft matter systems that maintains itself away from thermodynamic equilibrium by the continual consumption of chemical energy. Individual proteins consume energy and break detailed balance to drive active force generation by molecular motors, force-dependent binding kinetics, and chemically driven (dis)assembly. These non-equilibrium dynamics propagate across heterogeneous structures to drive essential life processes such as replication, migration, and shape change at the scale of both single cells and multicellular tissues. While much work has been done to understand the molecular processes underlying each individual non-equilibrium behavior, we lack a general understanding of how the microscopic breaking of detailed balance translates to large-scale cellular behaviors and material properties. Using the tools of non-equilibrium thermodynamics, this thesis examines this question by measuring energy dissipation during dynamical and mechanical phase transitions seen in experiments, simulations, and theoretical models of biological materials. We choose the actomyosin cytoskeleton, a network composed of polymeric proteins (actin) that are driven away from thermodynamic equilibrium by the activity of molecular motors (myosin), as our model system. Actomyosin contains the three types of non-equilibrium driving we will focus on: force generation, non-equilibrium binding kinetics, and active (dis)assembly. At the subcellular level, analysis of actin filament motions in experiments shows that energy dissipated through bending controls the transition between stable and contractile steady states. Using simulations, we show that non-equilibrium binding kinetics of molecular motors controls a fluid-solid phase transition characterized by thermodynamic quantities with opposite symmetries under time-reversal. 
At the cellular level, we develop new tools for measuring irreversibility in spatiotemporal dynamics to analyze the energetic costs of oscillations and synchronization of a model biochemical oscillator inspired by (dis)assembly-driven actomyosin dynamics. Throughout this thesis, we show that a cell’s distance from equilibrium, quantified by energy dissipation, tunes its mechanical properties and dynamics. This provides a framework to unify disparate biological functions through the lens of non-equilibrium thermodynamics.