451 research outputs found

    Toward Accessible Multilevel Modeling in Systems Biology: A Rule-based Language Concept

    Promoted by advanced experimental techniques for obtaining high-quality data and by steadily accumulating knowledge about the complexity of life, the modeling of biological systems at multiple interrelated levels of organization has recently attracted growing attention. Current approaches to modeling multilevel systems typically lack an accessible formal modeling language or are severely limited in expressiveness. The aim of this thesis is to provide a comprehensive discussion of the associated problems and needs and to propose a concrete solution that addresses them.

    Spatio-temporal Dynamics of the Wnt/beta-catenin Signaling Pathway: A Computational Systems Biology Approach

    The Wnt/β-catenin signaling pathway is involved in human neural progenitor cell differentiation. This dissertation employs the cyclic workflow of computational systems biology to investigate the pathway's spatio-temporal dynamics during differentiation. Quantitative in vitro analyses show biphasic kinetics of the pathway proteins. A computational model is developed to investigate these kinetics in silico in correlation with the cell cycle and self-induced signaling. We show the importance of a stochastic approach and suggest further experiments, thereby closing the computational systems biology loop.

    Domain-specific languages for modeling and simulation

    Simulation models and simulation experiments are increasingly complex. One way to handle this complexity is to develop software languages tailored to specific application domains, so-called domain-specific languages (DSLs). This thesis explores the potential of employing DSLs in modeling and simulation. We study different DSL design and implementation techniques and illustrate, with several examples, their benefits for expressing simulation models as well as simulation experiments.

    The Attributed Pi Calculus

    The attributed pi calculus (pi(L)) extends the pi calculus with attributed processes and attribute-dependent synchronization. To ensure flexibility, the calculus is parametrized with a language L that defines the possible values of attributes. pi(L) can express polyadic synchronization as in pi@ and thus diverse compartment organizations. A non-deterministic and a stochastic semantics, in which rates may depend on attribute values, are introduced. The stochastic semantics is based on continuous-time Markov chains. A simulation algorithm is developed that is firmly rooted in this stochastic semantics. Two examples, the movement processes in the phototaxis of Euglena and the cooperative binding in the gene regulation of the lambda phage, underline the applicability of pi(L) to systems biology.
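To make the idea of a simulation algorithm rooted in a continuous-time Markov chain more concrete, the following is a minimal Python sketch of a Gillespie-style direct method in which each transition rate is a function of the current state. It is not the pi(L) simulator described in the abstract; the function name `simulate_ctmc`, the `(rate_fn, update_fn)` representation of transitions, and the decay example are all assumptions made purely for illustration.

```python
import random


def simulate_ctmc(state, transitions, t_end, rng):
    """Simulate one trajectory of a continuous-time Markov chain.

    transitions is a list of (rate_fn, update_fn) pairs (hypothetical
    representation): rate_fn(state) returns a non-negative rate and
    update_fn(state) returns the successor state.
    """
    t = 0.0
    trajectory = [(t, state)]
    while True:
        # Rates may depend on the current state (e.g. on attribute values).
        rates = [rate_fn(state) for rate_fn, _ in transitions]
        total = sum(rates)
        if total <= 0.0:
            break  # no transition is enabled
        # The waiting time is exponentially distributed with the total rate.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose one transition with probability proportional to its rate.
        pick = rng.uniform(0.0, total)
        acc = 0.0
        for rate, (_, update_fn) in zip(rates, transitions):
            acc += rate
            if rate > 0.0 and pick <= acc:
                state = update_fn(state)
                break
        trajectory.append((t, state))
    return trajectory


if __name__ == "__main__":
    # Illustrative example: n particles, each decaying at rate 0.5,
    # so the total decay rate 0.5 * n depends on the state.
    decay = (lambda n: 0.5 * n, lambda n: n - 1)
    for time, n in simulate_ctmc(5, [decay], 1000.0, random.Random(0)):
        print(f"t={time:.3f}  n={n}")
```

Passing the random number generator explicitly keeps runs reproducible; a real implementation for a process calculus would additionally have to derive the enabled transitions and their rates from the process term at every step.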

    Mathematical models of cellular signaling and supramolecular self-assembly

    Synthetic biologists endeavor to predict how the increasing complexity of multi-step signaling cascades impacts the fidelity of molecular signaling, whereby cellular state information is often transmitted by proteins diffusing in a pseudo-one-dimensional stochastic process. We address this problem by using a one-dimensional drift-diffusion model to derive an approximate lower bound on the degree of facilitation needed to achieve single-bit informational efficiency in signaling cascades as a function of their length. We find that a universal curve of the Shannon-Hartley form describes the information transmitted by a signaling chain of arbitrary length and depends on only a small number of physically measurable parameters. This enables our model to be used in conjunction with experimental measurements to aid in the selective design of biomolecular systems. Another important concept in the cellular world is molecular self-assembly. As manipulating the self-assembly of supramolecular and nanoscale constructs at the single-molecule level increasingly becomes the norm, new theoretical scaffolds must be erected to replace the classical thermodynamic and kinetics-based models. The models we propose use state probabilities as their fundamental objects and directly model the transition probabilities between the initial and final states of a trajectory. We leverage these probabilities in the context of molecular self-assembly to compute the overall likelihood that a specified experimental condition leads to a desired structural outcome. We also investigated a larger complex self-assembly system, the heterotypic interactions between amyloid-beta and fatty acids, with an independent ensemble kinetic simulation based on an underlying system of differential equations, which was validated by biophysical experiments.

    The Impact of Large Language Models on Scientific Discovery: a Preliminary Study using GPT-4

    In recent years, groundbreaking advancements in natural language processing have culminated in the emergence of powerful large language models (LLMs), which have showcased remarkable capabilities across a vast array of domains, including the understanding, generation, and translation of natural language, and even tasks that extend beyond language processing. In this report, we delve into the performance of LLMs within the context of scientific discovery, focusing on GPT-4, the state-of-the-art language model. Our investigation spans a diverse range of scientific areas encompassing drug discovery, biology, computational chemistry (density functional theory (DFT) and molecular dynamics (MD)), materials design, and partial differential equations (PDE). Evaluating GPT-4 on scientific tasks is crucial for uncovering its potential across various research domains, validating its domain-specific expertise, accelerating scientific progress, optimizing resource allocation, guiding future model development, and fostering interdisciplinary research. Our exploration methodology primarily consists of expert-driven case assessments, which offer qualitative insights into the model's comprehension of intricate scientific concepts and relationships, and occasionally benchmark testing, which quantitatively evaluates the model's capacity to solve well-defined domain-specific problems. Our preliminary exploration indicates that GPT-4 exhibits promising potential for a variety of scientific applications, demonstrating its aptitude for handling complex problem-solving and knowledge integration tasks. Broadly speaking, we evaluate GPT-4's knowledge base, scientific understanding, scientific numerical calculation abilities, and various scientific prediction capabilities. Comment: 230-page report; 181 pages for main content.

    Research Week 2014


    Simulation Intelligence: Towards a New Generation of Scientific Methods

    The original "Seven Motifs" set forth a roadmap of essential methods for the field of scientific computing, where a motif is an algorithmic method that captures a pattern of computation and data movement. We present the "Nine Motifs of Simulation Intelligence", a roadmap for the development and integration of the essential algorithms necessary for a merger of scientific computing, scientific simulation, and artificial intelligence. We call this merger simulation intelligence (SI), for short. We argue that the motifs of simulation intelligence are interconnected and interdependent, much like the components within the layers of an operating system. Using this metaphor, we explore the nature of each layer of the simulation intelligence operating system stack (SI-stack) and the motifs therein: (1) Multi-physics and multi-scale modeling; (2) Surrogate modeling and emulation; (3) Simulation-based inference; (4) Causal modeling and inference; (5) Agent-based modeling; (6) Probabilistic programming; (7) Differentiable programming; (8) Open-ended optimization; (9) Machine programming. We believe coordinated efforts between motifs offer immense opportunity to accelerate scientific discovery, from solving inverse problems in synthetic biology and climate science, to directing nuclear energy experiments and predicting emergent behavior in socioeconomic settings. We elaborate on each layer of the SI-stack, detailing the state-of-the-art methods, presenting examples to highlight challenges and opportunities, and advocating for specific ways to advance the motifs and the synergies from their combinations. Advancing and integrating these technologies can enable a robust and efficient hypothesis-simulation-analysis type of scientific method, which we introduce with several use-cases for human-machine teaming and automated science.

    Text Mining for Pathway Curation

    Biological knowledge often involves understanding the interactions between molecules, such as proteins and genes, that form functional networks called pathways. New knowledge about pathways is typically communicated through publications and later condensed into structured formats such as textbooks, pathway databases or mathematical models. However, curating updated pathway models can be labour-intensive due to the growing volume of publications. This thesis investigates text mining methods to support pathway curation. We present PEDL (Protein-Protein-Association Extraction with Deep Language Models), a machine learning model designed to extract protein-protein associations (PPAs) from biomedical text. PEDL uses distant supervision and pre-trained language models to achieve higher accuracy than the state of the art. An expert evaluation confirms its usefulness for pathway curators. We also present PEDL+, a command-line tool that allows non-expert users to efficiently extract PPAs. When applied to pathway curation tasks, 55.6% to 79.6% of PEDL+ extractions were found useful by curators. The large number of PPAs identified by text mining can be overwhelming for researchers. To help, we present PathComplete, a model that suggests potential extensions to a pathway. It is the first method based on supervised machine learning for this task, using transfer learning from pathway databases. Our evaluations show that PathComplete significantly outperforms existing methods. Finally, we generalise pathway extension from PPAs to more realistic complex events. Here, our novel method for conditional graph modification outperforms the current best by 13-24% accuracy on three benchmarks. We also present a new dataset for event-based pathway extension. Overall, our results show that deep learning-based information extraction is a promising basis for supporting pathway curators.