13 research outputs found

    Inference of a Probabilistic Boolean Network from a Single Observed Temporal Sequence

    The inference of gene regulatory networks is a key issue for genomic signal processing. This paper addresses the inference of probabilistic Boolean networks (PBNs) from observed temporal sequences of network states. Since a PBN is composed of a finite number of Boolean networks, a basic observation is that the characteristics of a single Boolean network without perturbation may be determined by its pairwise transitions. Because the network function is fixed and there are no perturbations, a given state will always be followed by a unique state at the succeeding time point. Thus, a transition counting matrix compiled over a data sequence will be sparse and contain only one entry per line. If the network also has perturbations, with small perturbation probability, then the transition counting matrix would have some insignificant nonzero entries replacing some (or all) of the zeros. If a data sequence is sufficiently long to adequately populate the matrix, then determination of the functions and inputs underlying the model is straightforward. The difficulty comes when the transition counting matrix consists of data derived from more than one Boolean network. We address the PBN inference procedure in several steps: (1) separate the data sequence into "pure" subsequences corresponding to constituent Boolean networks; (2) given a subsequence, infer a Boolean network; and (3) infer the probabilities of perturbation, the probability of there being a switch between constituent Boolean networks, and the selection probabilities governing which network is to be selected given a switch. Capturing the full dynamic behavior of probabilistic Boolean networks, be they binary or multivalued, will require the use of temporal data, and a great deal of it. This should not be surprising given the complexity of the model and the number of parameters, both transitional and static, that must be estimated. 
In addition to providing an inference algorithm, this paper demonstrates that the data requirement is much smaller if one does not wish to infer the switching, perturbation, and selection probabilities, and that constituent-network connectivity can be discovered with decent accuracy for relatively short time-course sequences.
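
The transition-counting idea in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the 3-gene update rule, the single-gene-flip perturbation model, and all function names below are assumptions made only for illustration. The sketch covers a single Boolean network (no switching between constituent networks) and recovers each state's successor as the dominant entry in its row of the count matrix.

```python
import random
from collections import Counter, defaultdict


def next_state(s):
    """Hypothetical 3-gene Boolean update rule (illustrative assumption)."""
    x0, x1, x2 = s
    return (x1 & x2, x0 | x2, 1 - x0)


def simulate(length, p_perturb, seed=0):
    """Generate a state sequence; with probability p_perturb a step flips
    one random gene instead of applying the network function."""
    rng = random.Random(seed)
    s = (0, 0, 0)
    seq = [s]
    for _ in range(length):
        if rng.random() < p_perturb:
            i = rng.randrange(3)
            s = tuple(b ^ (j == i) for j, b in enumerate(s))
        else:
            s = next_state(s)
        seq.append(s)
    return seq


def count_transitions(seq):
    """Sparse transition counting matrix: counts[a][b] = times a was followed by b."""
    counts = defaultdict(Counter)
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1
    return counts


def infer_successors(counts):
    """Take the most frequent successor in each row as the network's transition."""
    return {a: row.most_common(1)[0][0] for a, row in counts.items()}


if __name__ == "__main__":
    clean = count_transitions(simulate(50, 0.0))
    print("rows without perturbation:", {a: dict(r) for a, r in clean.items()})
    noisy = count_transitions(simulate(20000, 0.02, seed=1))
    print("inferred successors:", infer_successors(noisy))
```

Without perturbation, every observed row holds a single nonzero entry, but only the states on the trajectory are ever visited. A small perturbation probability adds minor off-entries and lets the sequence reach the remaining states.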

    Gene Regulatory Networks: Modeling, Intervention and Context

    Biological systems are complex in many dimensions as endless transportation and communication networks all function simultaneously. Our ability to intervene within both healthy and diseased systems is tied directly to our ability to understand and model core functionality. The progress in increasingly accurate and thorough high-throughput measurement technologies has provided a deluge of data from which we may attempt to infer a representation of the true genetic regulatory system. A gene regulatory network model, if accurate enough, may allow us to perform hypothesis testing in the form of computational experiments. Of great importance to modeling accuracy is the acknowledgment of biological contexts within the models, i.e., recognizing the heterogeneous nature of the true biological system and the data it generates. This marriage of engineering, mathematics and computer science with systems biology creates a cycle of progress between computer simulation and lab experimentation, rapidly translating interventions and treatments for patients from the bench to the bedside. This dissertation will first discuss the landscape for modeling the biological system, explore the identification of targets for intervention in Boolean network models of biological interactions, and explore context specificity both in new graphical depictions of models embodying context-specific genomic regulation and in novel analysis approaches designed to reveal embedded contextual information. Overall, the dissertation will explore a spectrum of biological modeling with a goal towards therapeutic intervention, with both formal and informal notions of biological context, in such a way that will enable future work to have an even greater impact in terms of direct patient benefit on an individualized level. Dissertation/Thesis. Ph.D. Computer Science 201

    Stochastic Modeling and Inference of Large-scale Gene Regulatory Networks

    Gene regulatory networks (GRNs) consist of thousands of genes and proteins which are dynamically interacting with each other. Researchers have investigated how to uncover these unknown interactions by observing expressions of biological molecules with various statistical/mathematical methods. Once these regulatory structures are revealed, it is necessary to understand their dynamical behaviors since pathway activities could be changed by their given conditions. Therefore, both the regulatory structure estimation and dynamics modeling of GRNs are essential for biological research. GRN dynamics are usually investigated via stochastic models since molecular interactions are basically discrete and stochastic processes. However, this stochastic nature requires heavy simulation time to find the steady-state solution of GRNs in which thousands of genes are involved. This large number of genes also causes difficulties, such as the dimensionality problem, in estimating their regulatory structure. This thesis mainly focuses on developing methodologies for large-scale GRN analyses. It includes applications of a stochastic process theory called G-networks and a reverse engineering technique for large-scale GRNs. Additionally, a series of bioinformatics techniques was applied to brain tumor data to detect disease candidate genes along with their large-scale GRNs. The proposed techniques, stochastic modeling (bottom-up) and reverse engineering (top-down), could provide a systematic view of a complex system and an efficient guideline to identify candidate genes or pathways triggering a specific phenotype of a cell. As further work, the combined use of the modeling and reverse engineering approaches would be helpful in obtaining a reliable mathematical model and even in developing a synthetic biological system.

    Optimal Experimental Design in the Context of Objective-Based Uncertainty Quantification

    In many real-world engineering applications, model uncertainty is inherent. Large-scale dynamical systems cannot be perfectly modeled due to system complexity, lack of sufficient training data, perturbation, or noise. Hence, it is often of interest to acquire more data through additional experiments to enhance the system model. On the other hand, the high cost of experiments and limited operational resources make it necessary to devise a cost-effective plan to conduct experiments. In this dissertation, we are concerned with the problem of prioritizing experiments, called experimental design, aimed at uncertainty reduction in dynamical systems. We take an objective-based view where both uncertainty and modeling objective are taken into account for experimental design. To do so, we utilize the concept of mean objective cost of uncertainty to quantify uncertainty. The first part of this dissertation is devoted to the experimental design for gene regulatory networks. Owing to the complexity of these networks, accurate inference is practically challenging. Moreover, from a translational perspective it is crucial that gene regulatory network uncertainty be quantified and reduced in a manner that pertains to the additional cost of network intervention that it induces. We propose a criterion to rank potential experiments based on the concept of mean objective cost of uncertainty. To lower the computational cost of the experimental design, we also propose a network reduction scheme by introducing a novel cost function that takes into account the disruption in the ranking of potential experiments caused by gene deletion. We investigate the performance of both the optimal and the approximate experimental design methods on synthetic and real gene regulatory networks. In the second part, we turn our attention to canonical expansions. Canonical expansions are convenient representations that can facilitate the study of random processes. 
We discuss objective-based experimental design in the context of canonical expansions for three major applications: filtering, signal detection, and signal compression. We present the general experimental design framework for linear filtering and specifically solve it for Wiener filtering. Then we focus on the Karhunen-Loève expansion to study experimental design for signal detection and signal compression applications when the noise variance and the signal covariance matrix are unknown, respectively. In particular, we find the closed-form solution for the intrinsically Bayesian robust Karhunen-Loève compression, which is required for the experimental design in the case of signal compression.
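
Ranking experiments by mean objective cost of uncertainty (MOCU) can be sketched on a toy problem. The sketch assumes the commonly used definition of MOCU, the expected gap between the cost of a single robust action and the cost of the action that would be optimal if the true model were known. The dissertation's exact criterion may differ. The two binary parameters, the action set, the quadratic cost, and every function name are hypothetical.

```python
from itertools import product

# Hypothetical uncertainty class: two unknown binary parameters with a uniform prior.
PRIOR = {theta: 0.25 for theta in product((0, 1), repeat=2)}
ACTIONS = (0, 1, 2, 3)


def cost(theta, action):
    """Hypothetical cost: squared distance from a model-dependent target."""
    a, b = theta
    return (action - (2 * a + b)) ** 2


def mocu(prior):
    """Expected cost of the best single (robust) action minus the expected
    cost of acting optimally for each model in the uncertainty class."""
    robust = min(sum(p * cost(t, act) for t, p in prior.items()) for act in ACTIONS)
    optimal = sum(p * min(cost(t, act) for act in ACTIONS) for t, p in prior.items())
    return robust - optimal


def expected_remaining_mocu(prior, index):
    """Average MOCU left after an experiment that reveals parameter `index`."""
    remaining = 0.0
    for value in (0, 1):
        kept = {t: p for t, p in prior.items() if t[index] == value}
        prob = sum(kept.values())
        if prob == 0:
            continue  # outcome cannot occur under this prior
        posterior = {t: p / prob for t, p in kept.items()}
        remaining += prob * mocu(posterior)
    return remaining


def rank_experiments(prior):
    """Order experiments so the one leaving the least expected MOCU comes first."""
    return sorted(range(2), key=lambda i: expected_remaining_mocu(prior, i))


if __name__ == "__main__":
    print("MOCU before any experiment:", mocu(PRIOR))
    for i in rank_experiments(PRIOR):
        print("experiment on parameter", i, "->", expected_remaining_mocu(PRIOR, i))
```

In this toy setting the first parameter shifts the target more than the second, so measuring it removes more of the uncertainty that matters for the objective, and it is ranked first.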

    Functional analysis of High-Throughput data for dynamic modeling in eukaryotic systems

    The behavior of all biological systems is governed by numerous regulatory mechanisms, acting on different levels of time and space. The study of these regulations has greatly benefited from the immense amount of data that has become available from high-throughput experiments in recent years. To interpret this mass of data and gain new knowledge about studied systems, mathematical modeling has proven to be an invaluable method. Nevertheless, before data can be integrated into a model it must be aggregated and analyzed, and its most important aspects must be extracted. We present four Systems Biology studies on different cellular organizational levels and in different organisms. Additionally, we describe two software applications that enable easy comparison of data and model results. We use these in two of our studies on mitogen-activated protein (MAP) kinase signaling in Saccharomyces cerevisiae to generate model alternatives and adapt our representation of the system to biological data. In the two remaining studies we apply bioinformatic methods to analyze two high-throughput time series on protein and mRNA expression in mammalian cells. We combine the results with network data and use annotations to identify modules and pathways that change in expression over time, so that the datasets can be interpreted. In the case of the human somatic cell reprogramming (SCR) system this analysis leads to the generation of a probabilistic Boolean model which we use to generate new hypotheses about the system. In the last system we examined, the infection of mammalian (Canis familiaris) cells by the influenza A virus, we find new interconnections between host and virus and are able to integrate our data with existing networks. 
In summary, many of our findings show the importance of data integration into mathematical models and the high degree of connectivity between different levels of regulation.

    Validación de modelos genéticos en bioinformática: implementación y visualización

    Doctoral Program in Biotechnology, Engineering and Chemical Technology. Research Line: Engineering, Data Science and Bioinformatics. Program Code: DBI. Line Code: 111. Since the human genome was completely sequenced for the first time, the great scientific and technological advances in the biotechnology industry have greatly reduced the cost of experiments while significantly improving results. This has led to an exponential growth in the biological information available and, due to this huge amount of information, researchers are faced with mountains of data with only flakes of knowledge. Approaches such as Knowledge Discovery in Databases (KDD) are used to generate models that allow researchers to gain knowledge about complex biological systems. Gene networks arose as a straightforward way of representing gene sets including their interactions. They are presented as a network structure where each node represents a gene or gene product (protein) while each edge denotes the relationship between the nodes at its ends. The concrete nature of each relationship and the meaning of its weight depend on the network architecture and the inference algorithm used. A gene network is an abstraction that facilitates the study of its underlying biological system. Gene networks are easy to visualize, and they are informative on their own. Gene networks have been successfully used in clinical diagnosis and a large number of inferred interactions have been confirmed experimentally, thus confirming their reliability. The inference of gene networks has also allowed a better understanding of fundamental processes that occur in living organisms such as development or nutrition and metabolic coordination. Research has focused on inferring these networks using different experimental and computational techniques, as well as analyzing those networks to extract knowledge. 
Also, a significant number of methods have been developed to validate the inferred networks in order to verify their quality and reliability. All the methodologies of gene network inference, analysis, and validation are based on algorithms and computer tools. Given the increasing importance and popularity of these computational approaches, it becomes increasingly critical to ensure that the software is usable and accessible, as these features provide the basis for the reproducibility of published biomedical research. Based on the existing need for automatic techniques of inference, analysis and validation of models for the study of interactions between genes and the deficiencies in existing techniques, this work presents different novel approaches for the inference, analysis and validation of genetic models, especially gene networks, with a special emphasis on the usability and accessibility of the proposed solutions. Universidad Pablo de Olavide de Sevilla. Escuela de Doctorad

    An Initial Framework Assessing the Safety of Complex Systems

    Work presented at the Conference on Complex Systems, held online from 7 to 11 December 2020. Atmospheric blocking events, that is, large-scale nearly stationary atmospheric pressure patterns, are often associated with extreme weather in the mid-latitudes, such as heat waves and cold spells, which have significant consequences for ecosystems, human health and the economy. The high impact of blocking events has motivated numerous studies. However, there is not yet a comprehensive theory explaining their onset, maintenance and decay, and their numerical prediction remains a challenge. In recent years, a number of studies have successfully employed complex network descriptions of fluid transport to characterize dynamical patterns in geophysical flows. The aim of the current work is to investigate the potential of so-called Lagrangian flow networks for the detection and perhaps forecasting of atmospheric blocking events. The network is constructed by associating nodes to regions of the atmosphere and establishing links based on the flux of material between these nodes during a given time interval. One can then use effective tools and metrics developed in the context of graph theory to explore the atmospheric flow properties. In particular, Ser-Giacomi et al. [1] showed how optimal paths in a Lagrangian flow network highlight distinctive circulation patterns associated with atmospheric blocking events. We extend these results by studying the behavior of selected network measures (such as degree, entropy and harmonic closeness centrality) at the onset of and during blocking situations, demonstrating their ability to trace the spatio-temporal characteristics of these events. This research was conducted as part of the CAFE (Climate Advanced Forecasting of sub-seasonal Extremes) Innovative Training Network, which has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 813844.
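
The network construction described in this abstract can be sketched as follows. The sketch assumes that tracer particles have already been advected and that each particle is labeled with the region (node) where it started and where it ended. The toy labels, the row normalization, the specific degree and entropy definitions, and the function names are all illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def flow_network(start_boxes, end_boxes):
    """Build link weights P[i][j]: fraction of material leaving region i
    that is found in region j at the end of the time interval."""
    counts = defaultdict(Counter)
    for i, j in zip(start_boxes, end_boxes):
        counts[i][j] += 1
    network = {}
    for i, row in counts.items():
        total = sum(row.values())
        network[i] = {j: c / total for j, c in row.items()}
    return network


def out_degree(network):
    """Number of distinct regions reached from each node."""
    return {i: len(row) for i, row in network.items()}


def entropy(network):
    """Shannon entropy of each node's outgoing weights (0 when all material
    from a node ends up in a single region)."""
    return {
        i: -sum(p * math.log(p) for p in row.values())
        for i, row in network.items()
    }


if __name__ == "__main__":
    # Hypothetical particle labels: start region and end region of 8 tracers.
    start = [0, 0, 0, 0, 1, 1, 2, 2]
    end = [0, 1, 1, 2, 1, 1, 0, 2]
    net = flow_network(start, end)
    print("weights:", net)
    print("out-degree:", out_degree(net))
    print("entropy:", entropy(net))
```

In this toy example node 1 sends all of its material back to itself, so its out-degree is 1 and its entropy is zero, while node 0 disperses material to three regions.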

    Tools and Algorithms for the Construction and Analysis of Systems

    This open access two-volume set constitutes the proceedings of the 27th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2021, which was held during March 27 – April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. The 41 full papers presented in the proceedings were carefully reviewed and selected from 141 submissions. The volume also contains 7 tool papers, 6 Tool Demo papers, and 9 SV-Comp Competition Papers. The papers are organized in topical sections as follows: Part I: Game Theory; SMT Verification; Probabilities; Timed Systems; Neural Networks; Analysis of Network Communication. Part II: Verification Techniques (not SMT); Case Studies; Proof Generation/Validation; Tool Papers; Tool Demo Papers; SV-Comp Tool Competition Papers.

    The interactive ecology of construal in gesture: a microethnographic analysis of peer learning at an EMI university in China

    Depictive manual gestures do not appear in isolation, but are motivated by a complex of experiential knowledge, communicative goals, and contextual-environmental factors (Harrison 2018; Kendon 2004; Müller 2014; Streeck 1993, 1994, 2009b). However, little is known about the incremental, moment-by-moment formulation of depictions in elaborate sequences of talk. Furthermore, questions endure about depiction as a learning resource within the contingent interactivity of the foreign language academic classroom. This study explores these questions in the context of subject-related student talk at a Sino-foreign university in China by focusing on how gesturers build expositions through intercorporeal and intersubjective sense making (cf. Merleau-Ponty 1945/2012). Drawing on empirical material from the corpus of Chinese Academic Written and Spoken English (CAWSE), I aim to contribute greater understanding of the intersubjective ecology of depictive gesturing. The study builds on previous research on depictive gestures in the classroom (e.g. Rosborough 2014; Roth & Lawless 2002) by focusing on sequences of gesturing within two distinct classroom tasks: i) dialogic explanations of complex systems and ii) interactional multi-party group discussions. By converging theories of intersubjectivity drawing on Cognitive Grammar (e.g. Langacker 2008; Blomberg & Zlatev 2014) and Conversation Analysis (Heritage & Atkinson 1984; Schegloff 1992), I use microethnography for the investigation of gesture as a cognitive practice (Streeck 2009b; cf. Erickson 1995; Streeck & Mehus 2005). The analysis engages concepts in phenomenology, ecological cognition and enactivism in order to illustrate the publicly displayable achievement of enactive construal in spoken exposition. These analyses expose the ways that speakers depict for intersubjective visualization of the topic-at-hand, and anticipate and react to affordances that occur within the landscape of interaction. 
Speakers design their depictions by manipulating construal dimensions in three ways: i) depictions are integrated into the exposition for projecting and delimiting epistemic arenas where construal relations are tailored for specific structural aspects of the depictions, ii) depictions invite participatory frameworks for co-analysis of the topic-at-hand, and iii) speakers refashion their depictions to anticipate previous trouble. Furthermore, the analysis of the interactional order of the tasks illustrates the intercorporeality, the pre-reflective disposition towards sense-making, of construal in the moment-by-moment construction of academic classroom talk. This study has implications that problematize the notion of the body as a communicative resource by obscuring the notions of planning and strategy. Overall, the analysis shows that explanations and discussions involve finely grained attenuation of the corporeal dimensions of spoken language.